Maxwell's demon was born in 1867 and still thrives in modern physics. He plays important roles
in clarifying the connections between two theories: thermodynamics and information. Here, we 
present the history of the demon and a variety of interesting consequences of the second law of 
thermodynamics, mainly in quantum mechanics, but also in the theory of gravity. We also highlight 
some of the recent work that explores the role of information, illuminated by Maxwell's demon, in 
the arena of quantum information theory. 
I. INTRODUCTION

The main focus of this article is the second law of ther- 
modynamics in terms of information. There is a long 
history concerning the idea of information in physics, es- 
pecially in thermodynamics, because of the significant 
resemblance between the information theoretic Shannon 
entropy and the thermodynamic Boltzmann entropy, de- 
spite the different underlying motivations and origins of 
the two theories (see, for example, Ref. [1] and references
therein). A definitive discovery in this context is
Landauer's erasure principle, which clearly asserts the
relationship between information and physics. As we will
see below, having been extended to the quantum regime
by Lubkin [2], it has proven useful in understanding the
constraints on various information processing tasks from 
a physical point of view. 

Here we will review the background on the correspon- 
dence between information and physics, in particular 
from the point of view of thermodynamics. Starting with 
the classic paradox of Maxwell's demon, we will discuss 
the erasure principle from the perspectives that will be 
of interest to us in later sections. Then we shall review 
a variety of consequences of the second law, mainly in 
quantum mechanics, and also briefly in the theory of 
gravity. These are thought-provoking because the sec- 
ond law of thermodynamics is a sort of meta-rule, which 
holds regardless of the specific dynamics of the system we 
look at. Besides, the fundamental postulates of quantum 
mechanics, as well as general relativity, do not presume 
the laws of thermodynamics in the first place. After
reviewing these classic works and appreciating how
powerful and how universal the second law is, we will discuss
some of the recent progress concerning the intriguing
interplay between information and thermodynamics from the
viewpoint of quantum information theory.



II. MAXWELL'S DEMON 
A. The paradox 

A character who has played an important role in the 
history of physics, particularly in thermodynamics and 
information, is Maxwell's demon. It was first introduced 
by Maxwell in 1871 to discuss the "limitations of the 
second law of thermodynamics", which is also the title 
of a section in his book. The second law (in Clausius's 
version) states: "It is impossible to devise an engine
which, working in a cycle, shall produce no effect other
than the transfer of heat from a colder to a hotter body."
Maxwell devised his demon in a thought experiment to 
demonstrate that the second law is only a statistical prin- 
ciple that holds almost all the time, and not an absolute 
law set in stone. 

The demon is usually described as an imaginary tiny 
being that operates a tiny door on a partition which sep- 
arates a box into two parts of equal volumes, the left and 
the right. The box contains a gas which is initially in 
thermal equilibrium, i.e. its temperature T is uniform 
over the whole volume of the box. Let $\langle v \rangle_T$ denote the
average speed of the molecules that form the gas. The
demon observes the molecules in the left side of the box,
and if he sees one approaching the door with a speed less
than $\langle v \rangle_T$, then he opens the door and lets the molecule
go into the right side of the box. He also observes the
molecules in the right, and if he sees one approaching
with a speed greater than $\langle v \rangle_T$, then he opens the door
to let it move into the left side of the box.

Once he has induced a small difference in temperatures 
between the right and the left, his action continues to 
transfer heat from the colder part (right) to the hotter 
part (left) without exerting any work, thus he is breaking 
Clausius's form of the second law. This type of demon is 
referred to as the temperature demon. 
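
The sorting rule of the temperature demon can be illustrated with a small numerical sketch. It is only a toy model under stated assumptions: speeds are drawn as magnitudes of three-dimensional Gaussian velocities, the demon applies his rule once to every molecule, and the names `sample_speeds`, `demon_sort` and `mean_kinetic_energy` are hypothetical helpers introduced here for illustration.

```python
import random


def sample_speeds(n, rng, sigma=1.0):
    """Draw n molecular speeds as magnitudes of 3D Gaussian velocity vectors."""
    speeds = []
    for _ in range(n):
        vx, vy, vz = (rng.gauss(0.0, sigma) for _ in range(3))
        speeds.append((vx * vx + vy * vy + vz * vz) ** 0.5)
    return speeds


def demon_sort(left, right):
    """Apply the temperature demon's rule once to every molecule."""
    everything = left + right
    v_mean = sum(everything) / len(everything)  # threshold <v>_T
    # Slow molecules on the left pass to the right;
    # fast molecules on the right pass to the left.
    new_left = [v for v in left if v >= v_mean] + [v for v in right if v > v_mean]
    new_right = [v for v in left if v < v_mean] + [v for v in right if v <= v_mean]
    return new_left, new_right


def mean_kinetic_energy(speeds, mass=1.0):
    """Average kinetic energy per molecule (proportional to T for an ideal gas)."""
    return sum(0.5 * mass * v * v for v in speeds) / len(speeds)


rng = random.Random(0)  # fixed seed so the sketch is reproducible
left, right = sample_speeds(2000, rng), sample_speeds(2000, rng)
hot, cold = demon_sort(left, right)
print(mean_kinetic_energy(hot), mean_kinetic_energy(cold))
```

After sorting, the left side holds the faster molecules and therefore has the larger mean kinetic energy, which is the temperature difference the demon creates without doing work.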

There is another type of Maxwell's demon, who is "less 
intelligent" than the temperature demon. Such a demon 
merely allows all molecules moving in one direction to 
go through, while stopping all those moving the other 
way to produce a difference in pressure. This pressure 
demon runs a cycle by making the gas interact with a 
heat bath at a constant temperature after generating a 
pressure inequality. The sole net effect of this cycle is 
the conversion of heat transferred from the heat bath to 
work. This is also a plain violation of the second law, 
which rules out perpetuum mobile (in Kelvin's form): 
"It is impossible to devise an engine which, working in 
a cycle, shall produce no effect other than the extraction
of heat from a reservoir and the performance of an equal
amount of mechanical work."

The second law can also be phrased as "in any cyclic
process the total entropy of the physical systems involved
in the process will either increase or remain the same".
Entropy is, in thermodynamics, a state variable S whose
change is defined as $dS = \delta Q/T$ for a reversible process
at temperature T, where $\delta Q$ is the heat absorbed. Thus,
irrespective of the type of demon, temperature or
pressure, what he attempts to do is to decrease the entropy
of the whole system for the cyclic process.
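
As a worked illustration of why the temperature demon's action lowers the total entropy, the following sketch applies dS = δQ/T to two bodies that exchange a small amount of heat. It assumes both bodies are large enough to stay at fixed temperature; the function name and the numbers are illustrative only.

```python
def entropy_change_of_transfer(heat, t_source, t_sink):
    """Total entropy change when `heat` leaves a body at t_source
    and enters a body at t_sink (both treated as reservoirs)."""
    return -heat / t_source + heat / t_sink


# Demon's direction: heat flows from the colder side (290 K) to the hotter (310 K).
print(entropy_change_of_transfer(1.0, t_source=290.0, t_sink=310.0))
# Spontaneous direction: heat flows from hot to cold.
print(entropy_change_of_transfer(1.0, t_source=310.0, t_sink=290.0))
```

The first value is negative and the second positive, matching the statement that the demon attempts to decrease the entropy of the whole system.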

Historically, a number of physical mechanisms that 
might emulate the demon without any intelligent beings 
have also been proposed. A notable example is
the trap-door model of Smoluchowski [5]. Instead of
an intelligent demon operating the door, he considered 
a door that is attached to the partition by a spring so 
that it only opens to one side, the left, say. Then fast 
moving molecules in the right side can go into the left 
side by pushing the door, but slow ones are simply re- 
flected as the door is shut tightly enough for them and 
no molecules can go into the right from the left. After 
a while, the temperature (as well as the density) of the 
left side should become higher and the right side lower. 
Useful work would be extracted from this spontaneously 
generated temperature difference. Smoluchowski pointed 
out that what prevents the trap-door mechanism from 
achieving the demonic work are thermal fluctuations, i.e. 
Brownian motion, of the door. The door might be kicked 
sometimes to let a fast molecule in; however, the thermal 
fluctuations will lift the door up and let molecules go back 
to the right side, resulting in no net temperature
difference. This scenario was numerically analyzed in detail
in Ref. [6] to confirm the above reasoning. While there
have been many other similar mechanisms proposed, a
more sophisticated model was discussed by Feynman as
the ratchet-and-pawl machine [7], which, again, does not
work as a perpetual engine due to the thermal
fluctuations.

The demon puzzle, which had been a cardinal question 
in the theory of thermodynamics, is now why a demon 
can never operate beyond the apparently fundamental 
limits imposed by the second law, no matter how intel- 
ligent he is and no matter what type (temperature or 
pressure) he is. An ingenious idea by Szilard treated the 
demon's intelligence as information and linked it with 
physics. 



B. Szilard's engine 

In 1929, the Hungarian scientist Leo Szilard presented 
a classical (non-quantum) analysis of Maxwell's demon 
(pressure demon), formulating an idealized heat engine 
with a one-molecule gas [8]. Szilard's work was
epoch-making in the sense that he explicitly pointed out, for
the first time, the significance of information in physics.

The process employed by Szilard's engine is schematically
depicted in Fig. 1. A chamber of volume V contains
a gas, which consists of a single molecule (Fig. 1(a)). As
a first step of the process, a thin, massless, adiabatic
partition is inserted into the chamber quickly to divide
it into two parts of equal volumes. The demon measures
the position of the molecule, either in the right or in the
left side of the partition (Fig. 1(b)). The demon records
this result of the measurement for the next step. Then,
he connects a load of a certain mass to the partition on
the side where the molecule is supposed to be, according
to his recorded result of the previous measurement
(Fig. 1(c)). Keeping the chamber at a constant temperature
T by a heat bath, the demon can let the gas do
some work W by quasistatic isothermal expansion (the
partition now works as a piston). The gas returns to its
initial state, where it now occupies the whole volume V,
when the partition reaches the end of the chamber. During
the expansion, heat Q is extracted from the heat bath
and thus W = Q as it is an isothermal process. Hence,
Szilard's engine completes a cycle after extracting heat
Q and converting it to an equal amount of mechanical
work.

FIG. 1: Schematic diagram of Szilard's heat engine. A chamber
of volume V contains a one-molecule gas, which can be
found in either the right or the left part of the box. (a) Initially,
the position of the molecule is unknown. (b) Maxwell's
demon inserts a partition at the centre and observes the
molecule to determine whether it is in the right or the left
hand side of the partition. He records this information in his
memory. (c) Depending on the outcome of the measurement
(which is recorded in his memory), the demon connects a load
to the partition. If the molecule is in the right part as shown
in the figure, he connects the load to the right hand side of
the partition. (d) The isothermal expansion of the gas does
work upon the load, whose amount is kT ln 2, which we call 1
bit. (Adapted from Fig. 4 in Ref. [9].)

As the gas is expanded isothermally [106], the amount
of extracted work W is $kT \int_{V/2}^{V} V'^{-1}\,dV' = kT \ln 2$. An
immediate question here might be whether it is appropriate
to assume that the one-molecule gas is a normal ideal gas
in discussing thermodynamic/statistical properties. To
fill this conceptual gap, we can consider an ensemble of
one-molecule gases. Then by taking averages over the
ensemble we can calculate various quantities as if it were an
ideal gas with a large number of molecules. In a sense,
this can be seen as the origin of the intersection between
thermodynamics and information theory: looking
at the (binary) position of the molecule leads to its 'dual'
interpretations, i.e., in terms of thermodynamics and
information theory.

Naturally, the factor kT ln 2 appears often in the
following discussions on thermodynamic work, so we will
take it as a unit and call it '1 bit' when there is no risk
of confusion [107]. This will be especially useful when we
coordinate discussions of the information theoretic 'bit'
with the thermodynamic work.

The demon apparently violates the second law. As
a result of the perfect conversion of heat Q into work
W, the entropy of the heat bath has been reduced by
Q/T = W/T = k ln 2. According to the second law, there
must be an entropy increase of at least the same amount
somewhere to compensate this apparent decrease. Szilard
attributed the source of the entropy increase to
measurement. He wrote "The amount of entropy generated
by the measurement may, of course, always be greater
than this fundamental amount, but not smaller" [8]. He
referred to k ln 2 of entropy as a fundamental amount
well before Shannon founded information theory in 1948
[10, 11]. Although he regarded the demon's memory as
an important element in analyzing his one-molecule
engine, Szilard did not reveal the specific role of the
memory in terms of the second law. Nevertheless, his work is
very important as it was the first to identify the explicit
connection between information and physics.



C. Temporary solutions to the paradox 

Like Szilard, physicists believed for decades
that the paradox of Maxwell's demon could be solved by
attributing the entropy increase to measurement.
Noteworthy examples include those by Brillouin [12] and
Gabor [13]. They considered using light to measure the speed
of the molecules and (mistakenly) assumed this to be
the most general measurement setting. Inspired by the
work of Demers, who recognized in the 1940s that
a high temperature lamp is necessary to illuminate the
molecules so that the scattered light can be easily
distinguished from blackbody radiation, Brillouin showed that
information acquisition via light signals is necessarily
accompanied by an entropy increase, which is sufficient to
save the second law [12]. Interestingly, in his speculation,
Brillouin linked the thermodynamic and the information
entropies directly. Information entropy is a key function
in the mathematical theory of information, which was
founded by Shannon only a few years before Brillouin's
work, and although its logical origin is very different from
thermodynamics, Brillouin dealt with the two entropies on
the same footing by putting them in the same equation
to link the gain of information with the decrease of
physical entropy. This led to the idea of negentropy, which
is a quantity that behaves oppositely to the entropy [108]
[15, 16]. The negentropy is usually defined as the
difference between the maximum possible entropy of a system
under a given condition and the entropy it actually has,
i.e. $N := S_{\max} - S$.

Brillouin distinguished two kinds of information, free
and bound. Free information $I_f$ is an abstract and
mathematical quantity, but not physical. Bound information
$I_b$ is the amount of information that can be acquired
by measurement on a given physical system. Thus,
roughly speaking, free information is equivalent to
(abstract) knowledge in our mind and bound information
corresponds to the information we can get about a
physical system, which encodes the information to be sent
or stored. Bound information is then subject to
environmental perturbations during the transmission. When the
information carrier is processed at the end of the channel,
it is transformed into free information. In Brillouin's
hypothesis, the gain in bound information by measurement
is linked to changes in entropy in the physical system as

\[
I_b^{\text{post-meas}} - I_b^{\text{pre-meas}}
  = k(\ln P_{\text{pre-meas}} - \ln P_{\text{post-meas}})
  = S_{\text{pre-meas}} - S_{\text{post-meas}} > 0, \tag{1}
\]

where $P_{\text{pre-meas}}$ and $P_{\text{post-meas}}$ denote the numbers of
possible states of the physical system before and after the
measurement, and similarly $S_{\text{pre-meas}}$ and $S_{\text{post-meas}}$ are
the entropies of the system [109]. The conversion
coefficient between physical entropy and bound information is
chosen to be Boltzmann's constant to make the two
quantities comparable in the same units. Equation (1) means
that gaining bound information decreases the physical
entropy. This corresponds to the process (a) to (b) in
Fig. 1.
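
A minimal numerical reading of Brillouin's relation between state counting and bound information is sketched below. The function name `bound_information_gain` is a hypothetical helper, and the example values (two possible states before the measurement, one after) are chosen to match the position measurement in Szilard's engine.

```python
import math

K_B = 1.380649e-23  # Boltzmann constant in J/K


def bound_information_gain(states_before, states_after, k=K_B):
    """Gain in bound information, k (ln P_pre - ln P_post), in units of entropy."""
    if states_before <= 0 or states_after <= 0:
        raise ValueError("state counts must be positive")
    return k * (math.log(states_before) - math.log(states_after))


# Locating the single molecule in one half of the chamber: 2 states -> 1 state.
gain = bound_information_gain(2, 1)
print(gain, K_B * math.log(2))
```

The gain equals k ln 2, the same amount by which the physical entropy of the one-molecule gas is reduced when its position becomes known.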

As bound information is treated with the physical
entropy of the system on the same basis, the second law
needs to be expressed with bound information as well
as the physical entropy. If no information on the
physical system is available initially, that is, if $I_b^{\text{initial}} = 0$,
the final entropy of the system after obtaining (bound)
information is $S_f = S_i - I_b$. The second law of
thermodynamics says that in an isolated system the physical
entropy does not decrease [110]: $\Delta S_f \ge 0$. By using the
change in negentropy $\Delta N := -\Delta S$, the second law may
now be written as

\[
\Delta S_f = \Delta(S_i - I_b) = \Delta S_i - \Delta I_b = -\Delta N_i - \Delta I_b \ge 0,
\]

which means

\[
\Delta(N_i + I_b) \le 0. \tag{2}
\]

Naturally, if there is no change in the information
available to us, that is, $\Delta I_b = 0$, Eq. (2) is nothing but the
standard inequality for entropy, $\Delta S \ge 0$. However, in
Eq. (2) information is treated as part of the total
entropy, and it states that the quantity (negentropy +
information) never increases. This is a new interpretation
of the second law of thermodynamics, implied by
Brillouin's hypothesis.



Following Brillouin's hypothesis, Lindblad compared 
the entropy decrease in the system with the information 
gain an observer can acquire [13]. He analyzed
measurements of thermodynamic quantities in a fluctuating
system and showed that the information gain by the
observer is greater than or equal to the entropy reduction
in the system. Hence, the total entropy never decreases,
as expected.

Brillouin's idea of dealing with information and phys- 
ical entropy on an equal basis has been widely accepted. 
All discussions below about the physical treatment of in- 
formation processing tacitly assume this interpretation, 
which presupposes the duality of entropy, i.e. both infor- 
mation theoretic and thermodynamic aspects. 



III. EXORCISM OF MAXWELL'S DEMON: 
ERASURE OF CLASSICAL INFORMATION 
ENCODED IN CLASSICAL STATES 

Although the exorcism of Maxwell's demon by
attributing an entropy increase to the acquisition of
information had been widely accepted by physicists for more
than a decade, the demon turned out to have survived
until Landauer and Bennett put an end to the demon's
life by reconsidering the role of "memory", which
Szilard had only narrowly overlooked. Landauer examined the
process of erasure of information, introducing a new concept of
"logical irreversibility" [18].

Indeed, O. Penrose independently discovered essen- 
tially the same result about information erasure as Lan- 
dauer's. Penrose argued, in his book published in 1970, 
Foundations of Statistical Mechanics, that the paradox 
of Maxwell's demon could be solved by considering the 
entropy increase due to memory erasure. This was even
earlier than Bennett's 1982 analysis of the demon;
however, it was left virtually unnoticed by physicists.
Penrose's treatment was rather abstract and it did not go as
far as Bennett's work, which investigated the possibility
of measurement with arbitrarily little entropy increase.
Here we focus on the viewpoint of Landauer and
Bennett.

Since information processing must be carried out by a 
certain physical system, there should be a one-to-one cor- 
respondence between logical and physical states. Logical 
states may be described as an abstract set of variables 
on which some information processing can be performed. 
Then, a reversible logical process, which means an injec- 
tive (one-to-one) mapping for logical states, corresponds 
to a reversible physical process. By implicitly assuming 
a correspondence between logical and physical entropies, 
as Brillouin proposed, this implies that a reversible log- 
ical process can be realized physically by an isentropic 
process, i.e. an entropy-preserving process. 

However, a logically irreversible process is a
non-injective, i.e. many-to-one, mapping. Such a process
does not have a unique inverse as there may be many
possible original states for a single resulting state. The
key here is that memory erasure is a logically irreversible
process because many possible states of memory should 
be set to a single fixed state after an erasing procedure. It 
is impossible to determine the state prior to erasure with- 
out the aid of further information, such as the particular 
task of a computer programme or knowledge about the 
states of other memory registers that are correlated to 
the memory in question. This certain fixed state after 
erasure is analogous to a "white" or "blank" sheet of pa- 
per, on which no information is recorded. After erasing 
stored information, the state of memory should be in one 
specific state, in order not to carry any information (by 
definition of erasure). We will refer to the specific state 
after erasure as a standard state. 
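
The distinction between logically reversible and irreversible operations can be made concrete with a one-bit sketch. The helper `is_logically_reversible` is a hypothetical name; it simply tests whether a mapping is injective on a given set of logical states, which is the criterion stated above.

```python
def is_logically_reversible(operation, states):
    """Return True if `operation` is injective (one-to-one) on `states`."""
    outputs = [operation(state) for state in states]
    return len(set(outputs)) == len(outputs)


def logical_not(bit):
    """Flip the bit; the input can always be recovered from the output."""
    return 1 - bit


def erase(bit):
    """Reset the memory to the standard state 0, whatever it held before."""
    return 0


print(is_logically_reversible(logical_not, [0, 1]))  # injective
print(is_logically_reversible(erase, [0, 1]))        # many-to-one
```

Erasure maps both "0" and "1" to the standard state, so the prior state cannot be inferred from the result alone.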

In terms of physical states, a logically irreversible pro- 
cess reduces the degrees of freedom of the system, which 
implies a decrease in entropy. In order for this process to 
be physically legitimate, the energy must be dissipated 
into the environment. Landauer then perceived that logi- 
cal irreversibility must involve dissipation, hence erasing 
information in memory entails entropy increase (in the 
environment). This point will be the final sword to ex- 
orcize Maxwell's demon and is referred to as Landauer's 
erasure principle. 

Another important observation regarding the physics 
of information was given by Bennett [19]. He illustrated
that measurement can be carried out reversibly, i.e. with- 
out any change in entropy, provided the measuring ap- 
paratus is initially in a standard state, so that recording 
information in the memory does not involve the erasure 
of information previously stored in the same memory. 
The rough reason for this is that measurement can be re- 
garded as a process that correlates the memory with the 
system (in other words, a process that copies the memory 
state to another system in a standard state), which can 
be achieved reversibly, at least in principle. 

Bennett exemplified the reversible correlating process 
by a one bit memory consisting of an ellipsoidal piece 
of ferromagnetic material. The ferromagnetic piece is 
small enough so that it consists of only a single domain 
of magnetization. The direction of the magnetization 
represents the state of the memory. Suppose that there is 
a double well potential with respect to the direction of the 
magnetization in the absence of an external field: Parallel 
and antiparallel to the major axis of the ellipsoid are the 
most stable directions. The central peak between the two 
wells is considerably higher than kT in the absence of an 
external magnetic field, so that thermal fluctuations do 
not allow the state to climb over the peak. Figure 2 shows
a sketch of the potential that was illustrated in Ref. [19].

FIG. 2: A potential energy for a binary memory whose state
is represented by the direction of the magnetization. When
there is no external transverse magnetic field the memory is
stable in one of the potential wells, which correspond to "0"
and "1" of recorded information. The transverse field lowers
the height of the barrier at the centre. At a certain point, the
profile of the potential becomes a bath-tub-like shape with a
flat bottom, where the direction of the magnetization is very
sensitive to the longitudinal field component. (Adapted from
Ref. [19].) [Plot: energy versus direction of magnetisation,
shown for increasing intensity of the transverse field.]

Two minima of the potential represent the state of
memory, either "0" or "1", and the blank memory is
assumed to be in one standard state, e.g. "0", before
information is copied onto it from another memory. Let us
consider the process that correlates the state of a blank
memory B with that of a memory A, which is the
subject of measurement. This can be achieved by
manipulating the shape of the potential for the blank memory
as follows. By applying a transverse external magnetic
field, the peak of the central barrier becomes lower. At
a certain intensity of the field, there will be only a single
bath-tub-like flat bottom, i.e. the state of B becomes
very sensitive to a weak longitudinal component of the
field. The memory A is located so that its
magnetization can cause a faint longitudinal field at the position of
the memory B. Then because of B's sensitivity to such
a field, the state of A can be copied to the memory B
with arbitrarily small (but nonzero) energy consumption.
Removing the transverse field completes the correlating
process. The crux of the physics here is that this process
can be reversed by using the perturbation from another
reference memory, which is in the standard state.
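
At the logical level, the correlating process can be sketched as a controlled flip of B conditioned on A. This is only an abstract illustration of why copying onto a blank memory is reversible; it is not a model of the magnetic implementation described above, and the function name `correlate` is hypothetical.

```python
def correlate(a, b):
    """Flip memory B if and only if memory A holds 1; A is left unchanged."""
    return a, b ^ a


# Copying onto a blank memory (B in the standard state 0).
for a in (0, 1):
    print(a, correlate(a, 0))

# The map is its own inverse on all four joint states, hence reversible.
for a in (0, 1):
    for b in (0, 1):
        assert correlate(*correlate(a, b)) == (a, b)
```

If B is not blank beforehand, the same operation no longer produces a copy of A, which is why recording onto a used memory requires an erasure step first.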

Now let us focus on the erasure of information. Since
measurement can be done virtually without energy
consumption, it is the dissipation due to the erasure
process that compensates the entropy decrease induced by
Maxwell's demon in Szilard's model. The physical
system for the demon's memory can be modelled as a
one-molecule gas in a chamber of volume V, which is divided
into two parts, the left "L" and the right "R", by a
partition. The demon memorizes the measurement result by
setting the position of the molecule in this box. If the
molecule in Szilard's engine may be found in the left and
the right sides with equal probability, i.e. 1/2, then the
minimum amount of work that needs to be invested and
dissipated into the environment is kT ln 2.

FIG. 3: Thermodynamic process to erase information. Binary
information is stored in a vessel as the position of the
molecule, either L or R. A common procedure for both initial
states, i.e. removing the partition and halving the whole
volume by an isothermal compression towards the standard state
L, completes the erasure. (Adapted from Fig. 3 in Ref. [9].)

The actual process is as follows. The molecule is in
either L or R, depending on the information it stores
(Fig. 3(a)). To erase the stored information, first, we
remove the partition dividing the vessel at the centre
(Fig. 3(b)). Second, we insert a piston at the right end
(Fig. 3(c)), when the standard memory state is L, and
push it towards the left isothermally at temperature T
until the compressed volume becomes V/2 (Fig. 3(d)).
The resulting state is L for both initial states and the
information is erased. It is worth noting that the
erasing process should not depend on the initial state of the
memory. The "R" state in Fig. 3(a) may be transferred
to the "L" state by simply moving the region of volume V/2
to the left. However, in this case, the operator of the
piston needs to observe the position of the molecule and this
action requires another memory. Thus, the erasure
process should be independent of the initial memory state.
The work invested to compress the volume from V to V/2
is $W_{\text{erasure}} = kT \ln 2$ and this is dissipated as heat into
the environment, increasing its entropy by k ln 2, as
Landauer argued. As there is no wasted work (in the sense
that all invested work is converted into heat to increase
the entropy of the environment), kT ln 2 is the minimum
amount of work to be consumed for erasure.
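
The value of the erasure work can be checked numerically by integrating the ideal-gas pressure over the isothermal compression from V to V/2. The sketch below assumes a quasistatic process and a one-molecule ideal gas with p = kT/V; the function name, the midpoint-rule integration and the temperature of 300 K are illustrative choices.

```python
import math

K_B = 1.380649e-23  # Boltzmann constant in J/K


def isothermal_work_on_gas(v_start, v_end, temperature, n_molecules=1, steps=100_000):
    """Work done ON an ideal gas during a quasistatic isothermal volume change,
    computed as the midpoint-rule integral of -p dV with p = N k T / V."""
    dv = (v_end - v_start) / steps
    work = 0.0
    for i in range(steps):
        v_mid = v_start + (i + 0.5) * dv
        work -= n_molecules * K_B * temperature / v_mid * dv
    return work


T = 300.0  # assumed bath temperature in kelvin
w_erasure = isothermal_work_on_gas(1.0, 0.5, T)
print(w_erasure, K_B * T * math.log(2))    # both close to kT ln 2
print(w_erasure / T, K_B * math.log(2))    # entropy passed to the environment
```

Only the ratio of the final to the initial volume matters, so the chamber volume is set to 1 in arbitrary units.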

FIG. 4: Erasure process for an unbalanced probability
distribution. The only difference from the case of balanced
distribution (Fig. 3) is the expansion from (a) to (a'), which gives
us 1 - H(p) bits of work per molecule.

If there is a biased tendency in the frequency of
appearance of a particular memory state, say L, how much
would the erasure work be? The answer is simple: the
erasure work is proportional to the amount of
information stored, thus $W_{\text{erasure}} = kT \ln 2\, H(p)$, where p is the
probability for the molecule to be in the L state and

\[
H(p) = -p \log p - (1-p) \log(1-p) \tag{3}
\]

is the (binary) Shannon entropy. Throughout this article,
log denotes logarithms of base 2. The reason can be
explained by a process depicted in Fig. 4. The unbalanced
tendency between L and R is expressed by the numbers of
molecules in the L and the R regions, pN and (1-p)N out
of N molecules in total. As we consider only
an ideal gas (with no interactions between molecules),
this scenario does not change the discussion at all if we
average the erasure work at the end. Since removing the
partition at the beginning allows the gas an undesired
irreversible adiabatic expansion/compression, we first let
the gases in both parts expand/contract isothermally by
making the partition movable without friction (Fig. 4(a)
to (a')). During this process, the gases exert work
towards the outside. Letting $p_L$, $p_R$ and $V_L$ denote the
pressure in the left region, that in the right region, and
the volume of the region on the left of the partition,
respectively, we can write the work done by gases as

\[
\begin{aligned}
W &= \int_{V/2}^{pV} (p_L - p_R)\,dV_L \\
  &= NkT \int_{V/2}^{pV} \left( \frac{p}{V_L} - \frac{1-p}{V - V_L} \right) dV_L \\
  &= NkT \left( \ln 2 + p \ln p + (1-p) \ln(1-p) \right) \\
  &= NkT \ln 2\, (1 - H(p)).
\end{aligned} \tag{4}
\]

As the pressures in the left and the right are now equal, this
is the same situation as in Fig. 3(a). Hence, at least
NkT ln 2 of work needs to be consumed to set the
memory to the standard state (Fig. 3(c) to (d)). As a whole,
we invested $W_{\text{erasure}} = kT \ln 2 - W/N = kT \ln 2\, H(p)$ of
work per molecule.
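
The result for the biased case can be verified numerically. The sketch below integrates the pressure difference across the movable partition, per molecule and in units of kT with V = 1, and compares the net erasure work with ln 2 times the binary Shannon entropy. The function names and the sample probabilities are illustrative assumptions.

```python
import math


def binary_entropy(p):
    """Binary Shannon entropy H(p) in bits (logarithm base 2)."""
    if p in (0.0, 1.0):
        return 0.0
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)


def expansion_work_per_kT(p, steps=200_000):
    """Work per molecule (units of kT) gained while the partition moves
    from V/2 to pV, by midpoint-rule integration with V = 1."""
    v_start, v_end = 0.5, p
    dv = (v_end - v_start) / steps
    total = 0.0
    for i in range(steps):
        v_left = v_start + (i + 0.5) * dv
        total += (p / v_left - (1 - p) / (1 - v_left)) * dv
    return total


def erasure_work_per_kT(p):
    """Net erasure work per molecule: ln 2 for compression minus the gain above."""
    return math.log(2) - expansion_work_per_kT(p)


for p in (0.2, 0.5, 0.8):
    print(p, erasure_work_per_kT(p), math.log(2) * binary_entropy(p))
```

For p = 1/2 the partition does not move and the full kT ln 2 is required, while for strongly biased p the net cost becomes small, as expected for a memory that stores less than one bit.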

Maxwell's demon is now exorcized. The entropy de- 
crease, or the equivalent work the demon could give us, 
should be completely consumed to make his memory 
state come back to its initial state. The state of the
whole system, consisting of the heat engine and the de- 
mon, is restored after completing a thermodynamic cycle, 
without violating the second law. 
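
The bookkeeping of one complete cycle, including the resetting of the demon's memory, can be summarised in a short sketch. It assumes an unbiased one-bit memory and a single heat bath at temperature T; `extra_dissipation` is a hypothetical parameter standing for any work wasted beyond the minimum during erasure.

```python
import math

K_B = 1.380649e-23  # Boltzmann constant in J/K


def szilard_cycle_balance(temperature, extra_dissipation=0.0):
    """Net work gained and net bath entropy change over one full cycle."""
    work_out = K_B * temperature * math.log(2)       # isothermal expansion
    bath_loss = -work_out / temperature               # heat Q = W leaves the bath
    erasure_work = K_B * temperature * math.log(2) + extra_dissipation
    bath_gain = erasure_work / temperature            # erasure heat enters the bath
    return {
        "net_work": work_out - erasure_work,
        "net_entropy": bath_loss + bath_gain,
    }


print(szilard_cycle_balance(300.0))
print(szilard_cycle_balance(300.0, extra_dissipation=1e-21))
```

In the ideal case both entries vanish; any additional dissipation makes the net work negative and the net entropy change positive, so the second law is respected.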

IV. OTHER 'DERIVATIONS' OF THE 
ERASURE ENTROPY 



The key idea in her results is to make use of a quan- 
tity r, which was introduced by Jarzynski in the context 
of nonequilibrium thermodynamic processes [22]. In a 
classical setting, T is defined by 

r(c°,c T ) - -Hp f (x\pT)} + H Pl (x n ,p )} 

+l3AE(x°,p°,x^,pl), (8) 



We have focused on the one-molecule gas model so far, 
however, Landauer's erasure principle holds regardless of 
specific physical models. In order to see its generality 
with some concrete examples, we now briefly review two 
particularly interesting papers, one by Shizume and 
the other by Piechocinska [2lj . 

Shizume used a model of memory whose state was 
represented by a particle having Brownian motion in 
a time-dependent double well potential. Assuming the 
random force Fji(t) to be white and Gaussian satisfy- 
ing (F R (tx)F R (t 2 )) = 2m 1 T5(t 1 - t 2 ), the motion of the 
distribution function f(x,u,t) of the particle in the po- 
sition (x) and velocity (u) space can be described by the 
Fokker-Planck equation. Shizume then compared Q and 
TdS/dt, i.e. the ensemble average of the energy given 
to the particle from the environment per unit time, and 
the change in the entropy of the whole system per unit 
time multiplied by the temperature. The entropy S is 
the Shannon entropy of continuous distribution, defined 
by 

/>oo 

S := —k I dxduf(x,u,t) In f(x,u,t). (5) 

J oo 

With the help of the Fokker-Planck equation concerning 
f(x,u,t), one arrives at the relation, 



< T 



dS_ 



(6) 



from which we obtain the lower bound of the energy dissipated into the environment between times $t_i$ and $t_f$ as

$$\Delta Q_{\mathrm{out}}(t_i,t_f) = \int_{t_i}^{t_f} (-\dot{Q})\,dt \ge T[S(t_i) - S(t_f)]. \tag{7}$$

Equation (7) gives us the lower bound of the heat generation due to the process that erases $H(p)$ bits of information. As we expect, the lower bound is equal to $kT\ln 2\,H(p)$. Clearly, this derivation does not use the second law.

It is clear in Shizume's derivation that the entropy increase due to the erasure is independent of the second law. Hence it is immune to a common criticism against the erasure principle, namely that it is trivially the same as the second law because the second law is used in its derivation. However, the above description assumes only a specific physical model, and thus a more general model might be desirable. This was done by Piechocinska, who analyzed information erasure in a quantum setting as well as in classical settings [21].

In the classical setting, her argument is built on the quantity

$$\Gamma(\zeta) := -\ln[\rho_f(x^\tau,p^\tau)] + \ln[\rho_i(x^0,p^0)] + \beta\Delta E(x_T^0,p_T^0,x_T^\tau,p_T^\tau), \tag{8}$$

where $\zeta = (x,p,x_T,p_T)$ is a set of positions and momenta of the degrees of freedom that describe the (memory) system and the heat bath (subscript $T$), respectively. The superscripts $0$ and $\tau$ denote the initial and final times of the erasure process. $\rho_i$ and $\rho_f$ are the distribution functions of the particle representing the 'bit' in a double well potential, and are assumed to be canonical distributions: erasing information is expressed by the form of $\rho_f$, which takes nonzero values only in one of the two regions, i.e. either of those for '0' and '1', which corresponds to the 'L' state in Fig. 3. $\Delta E$ is the change in the internal energy of the heat bath and $\beta = (kT)^{-1}$.

The entropy increase due to erasure can be obtained by first calculating the statistical average over all possible trajectories $\zeta$. We then have $\langle e^{-\Gamma}\rangle = 1$, which in turn implies $-\langle\Gamma\rangle \le 0$ by the convexity of the exponential function. Substituting the expressions for $\rho_i$ and $\rho_f$ (the canonical distributions) into $\Gamma$ leads to the inequality

$$\ln 2 \le \beta\langle\Delta E\rangle. \tag{9}$$

As $\Delta E$ is the change in the internal energy of the heat bath, it includes the heat dissipated into the bath as well. Thus the conservation of energy can be written as $W = \Delta E + \Delta E_{\mathrm{system}}$, where $W$ is the work done on the system and the heat bath, and $\Delta E_{\mathrm{system}}$ is the change in the internal energy of the system. Due to the symmetry of $\rho_i$ and $\rho_f$, $\Delta E_{\mathrm{system}}$ vanishes on averaging, and therefore we now have

$$kT\ln 2 \le \langle W\rangle, \tag{10}$$

which is equivalent to Landauer's erasure principle.

Piechocinska applied a similar argument to the quantum case, i.e. the erasure of classical information stored in quantum states. The state of the bit can be reset after some interaction with the heat bath, which is initially in thermal equilibrium. By assuming that the bath decoheres into one of its energy eigenstates due to the interaction with an external environment, which may be much larger than the bath, we can deal with the heat dissipation (into the bath) quantitatively. Then, the minimum work consumption can be found to be $kT\ln 2$ after computing a quantity corresponding to $\Gamma$ in Eq. (8). A related, but more general, argument in a similar spirit has also been presented in [23].
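As a rough numerical illustration of the Landauer bound $kT\ln 2$ discussed above, the following Python sketch evaluates the minimum average work per erased bit, and the corresponding bound $kT\ln 2\,H(p)$ for a biased bit. The temperature of 300 K, the bias $p = 0.1$ and all function names are our own illustrative assumptions and do not appear in the works cited above.

```python
import math

K_B = 1.380649e-23  # Boltzmann constant in J/K (exact SI value)


def binary_entropy_bits(p: float) -> float:
    """Shannon entropy H(p), in bits, of a bit that is '1' with probability p."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -(p * math.log2(p) + (1.0 - p) * math.log2(1.0 - p))


def landauer_bound_joules(temperature_kelvin: float, bits: float = 1.0) -> float:
    """Lower bound kT ln2 * bits on the average work needed to erase `bits` bits."""
    return K_B * temperature_kelvin * math.log(2.0) * bits


# Illustrative values (assumptions, not taken from the text): T = 300 K, p = 0.1.
one_bit_cost = landauer_bound_joules(300.0)
biased_bit_cost = landauer_bound_joules(300.0, binary_entropy_bits(0.1))
```

At room temperature the bound is of the order of $10^{-21}$ J per bit, and a biased bit, carrying less than one bit of information, costs correspondingly less to erase.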



FIG. 5: The one-molecule heat engine considered by Gabor to show that light needs to behave like a wave. When the molecule comes into the illuminated region, a piston is automatically inserted and the 'gas' expands isothermally to extract work from the heat bath. This process could in principle be repeated indefinitely, converting an infinite amount of heat into mechanical work, if light had only a particle-like nature. (Adapted from Fig. 7 in Ref. [13].)



V. SOME INTERESTING IMPLICATIONS OF 
THE SECOND LAW 

A. Wave nature of light from the second law

Let us make a detour to another interesting and well-elaborated implication of the second law, which was argued by Gabor. He took further Brillouin's analysis of Maxwell's demon, in which a molecule is detected by light signals [13]. Gabor considered a one-molecule heat engine, a part of which is illuminated by an incandescent light to detect the molecule wandering into this region (Fig. 5). The detection of the molecule can be done by photosensitive elements placed around the light path, so that any weak scattered light will hit one (or more) of them. As soon as the molecule is found by detecting the scattered light, a piston is inserted at the edge of the illuminated region. Then, by isothermal expansion, the gas exerts mechanical work. The same process is repeated when the molecule wanders into the illuminated region again. This is a perpetuum mobile of the second kind, as it continues to convert heat from a heat bath to mechanical work. Gabor found that the second law is vulnerable if the light intensity can be concentrated in a well defined region and made arbitrarily large compared with the background blackbody radiation. He then deduced that this is impossible because light behaves both as waves and as a flux of particles. In other words, according to Gabor's argument, the second law implies the wave nature of light. This is an interesting implication of the second law in its own right, as there seems to be no direct link between thermodynamics and the nature of light.

However, we now know that this interpretation is wrong. Even if light behaves like particles, Gabor's engine does not violate the second law. The solution to this apparent paradox is, again, the erasure principle.

By detecting the molecule and extracting work subsequently, the whole system stores $H(p)$ bits of information, where $p$ is the probability of finding the molecule in the illuminated region. Let us assume for simplicity that the sampling frequency is low enough, i.e. that the sampling interval is long compared with the time necessary for the molecule to travel through the illuminated region [111]. Then, $p$ can also be interpreted as the ratio between the volume of the illuminated region and the whole volume of the chamber. This information is about the occurrence of the work extraction and is stored in the mechanism that resets the position of the piston after the extraction. The piston forgets the previous action, but the resetting mechanism does not. Thus the whole process is not totally cyclic, though it should be so to work as a perpetuum mobile. Because Gabor's engine is activated with probability $p$, the engine stores $H(p)$ bits on average. While $kT\ln 2\,H(p)$ of work is needed to erase this information to make the whole process cyclic, we gain only $-kTp\ln p$ of work from this process, which is smaller than the erasure work; hence there is no violation of the second law.

B. Gibbs paradox and quantum superposition principle



Consider a gas chamber of volume $V$ that is divided into two halves by a removable partition. Each half is filled with a dilute ideal gas at the same pressure $P$. We now consider the entropy increase that occurs when we let the gases expand into the whole volume $V$ by removing the partition. If the gas in one region (e.g., the left side) is different from the gas in the other (the right side), then the entropy increase due to the mixing is $kN\ln 2$, where $N$ is the total number of molecules in the chamber. On the other hand, if the gases in the two regions are identical, no thermodynamic change occurs and thus the entropy stays constant. This discontinuous gap of the entropy increase with respect to the similarity of the two gases is called the 'Gibbs paradox'. Landé dealt with this problem and 'derived' the wave nature of physical states as well as the superposition principle of quantum mechanics [24]. This work is not only interesting in that it attempted to link thermodynamics and quantum mechanics, but also useful for introducing the idea of semipermeable membranes, which we will use as a tool in later sections. Although there are a number of papers on the Gibbs paradox other than Landé's, going into these is beyond the scope of this brief review. Interested readers may refer to, for example, Refs. [25].

For convenience of the following discussion, we use the extractable work, i.e. the Helmholtz free energy, as it is equal to the entropy change (times temperature) in isothermal processes. The semipermeable membranes we introduce here are a sort of filter that distinguishes the property of gases, i.e. the nature of the molecules, and lets gas with one (or more) particular property pass through. In other words, a semipermeable membrane is transparent to one type of gas, but totally opaque to other types: each membrane can thus be characterized by the property of the gas it lets through. These are essentially the same as what von Neumann considered in his discussion to define the entropy of a quantum state, or von Neumann entropy [112]; they were scrutinized by Peres and shown to be legitimate quantum mechanically [29, 30]. In Landé's argument, however, quantum mechanics is not assumed from the outset.

Landé postulated the continuity of the entropy change in reality. To bridge the gap between the 'same' and 'different' gases, he introduced a fractional likeness, quantified as $q(A_i,B_j)$, between two states $A_i$ and $B_j$. Here $A$ (or $B$) represents a certain 'property' or 'observable', and the indices are the values of $A$ ($B$) by which we can distinguish the states completely. For simplicity, we assume all observables can take only discrete values when measured. Two states are completely different when $q = 0$ and identical when $q = 1$, and different values of the same property are perfectly distinguishable by some physical means, thus $q(A_i,A_j) = \delta_{ij}$. Now suppose that a semipermeable membrane that is opaque to $A_i$ but transparent to $A_j$ ($j \neq i$), to which we will refer as the membrane $M_{A_i}$, is placed in a gas whose property is $B_k$. Then the membrane will reflect a fraction $q = q(A_i,B_k)$ of the gas and pass the remaining fraction $1 - q$ as a result of the fractional likeness between $A_i$ and $B_k$. Another consequence of the membrane is that the molecules reflected by $M_{A_i}$ need to change their property from $B_k$ to $A_i$, and similarly the other molecules become $A_j$ with probability $q(A_j,B_k)$ [113], in order that a subsequent application of another $M_{A_i}$ does not change the state of the molecules. In what follows, we will identify the term 'property' with 'state', although it still does not necessarily mean a quantum state.

This solution to the Gibbs paradox, the introduction of fractional likeness between states, leads, according to Landé, to the wave-function-like description of state. A rough sketch of his idea is as follows. First, we write down the transition probabilities between different states in a matrix form

$$\begin{pmatrix} q(A_1,B_1) & q(A_1,B_2) & \cdots \\ q(A_2,B_1) & q(A_2,B_2) & \cdots \\ \vdots & \vdots & \ddots \end{pmatrix}. \tag{11}$$

Naturally, the sum of each row or column is always unity because a state must take one of the possible values in any measured property. Similar matrices should be obtained for $q(B,C)$, $q(A,C)$, etc., and we expect a mathematical relation between these matrices, for instance, such as $q(A_i,C_k) \stackrel{?}{=} \sum_j q(A_i,B_j)q(B_j,C_k)$. A consistent mathematical expression can be obtained by considering a matrix $\psi(A,B)$ whose $(i,j)$-th elements are given as $\sqrt{q(A_i,B_j)}\,e^{i\varphi}$ with arbitrary phase $\varphi$, i.e. a matrix whose rows and columns can each be regarded as a vector of unit norm, so that the $\psi$ for different pairs of properties are connected by an orthogonal (unitary) transformation. The arbitrariness of the phase is restricted by the condition $\psi(A_i,A_j) = \sum_k \psi(A_i,B_k)\psi(B_k,A_j) = \delta_{ij}$. Identifying $\psi(A_i,B_j) = \sqrt{q}\,e^{i\varphi}$ with a complex probability amplitude for the transition from $B_j$ to $A_i$ induced by the membranes, we see a superposition rule $\psi(A_i,C_k) = \sum_j \psi(A_i,B_j)\psi(B_j,C_k)$. Then Landé claims that "the introduction of complex probability amplitudes $\psi$ subject to the superposition rule is inseparably linked to the admission of fractional likenesses $q$."

[Figure 6: initial volumes $V/2$ and $V/2$; final volumes $V/(1+q)$ and $qV/(1+q)$]

FIG. 6: A possible configuration to confirm the continuity of the extractable work. The left (right) hand side of a chamber is filled with an $A_1$ ($B_1$)-gas. Two membranes that distinguish $A_1$ and $A_2$ are used to extract work. The membrane on the left lets the $A_1$ gas pass through freely, but reflects the $A_2$ gas completely. The other membrane works in the opposite manner. Since the $B_1$ gas changes its state into $A_1$ with probability $q$ when measured by the membrane, the right membrane does not reach the right end of the chamber by a (quasi-static) isothermal expansion.

To confirm the continuity of the extractable work (or the entropy increase) due to the mixing of two gases, let us look at a chamber, one half of which is filled with dilute gas $A_1$ and the other half with $B_1$, as in Fig. 6. The number of gas molecules is $N/2$ each. Let us also assume that both $A$ and $B$ are two-valued properties. If we use two membranes that distinguish the states $A_1$ and $A_2$, the work done by the gases will be smaller than $NkT\ln 2$ because a fraction of $B_1$ becomes $A_1$ with a certain probability $q$. The work done by the gases is given as $W = (NkT/2)[2\ln 2 + q\ln q - (1+q)\ln(1+q)]$, and $W$ decreases smoothly from $NkT\ln 2$ (when $q = 0$, for perfectly distinct gases) to 0 (when $q = 1$, identical gases); there is therefore no discontinuous entropy change. Note that this choice of membranes is not optimal for maximizing the amount of extractable work; we will look at this process in more detail in Section VII B.
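The smooth interpolation of the extractable work between the two limiting cases can be checked directly. The following Python sketch evaluates $W/(NkT) = \frac{1}{2}[2\ln 2 + q\ln q - (1+q)\ln(1+q)]$ on a grid of likeness values $q$; the function name and the grid are our own illustrative choices.

```python
import math


def mixing_work_per_NkT(q: float) -> float:
    """W/(NkT) = (1/2)[2 ln 2 + q ln q - (1 + q) ln(1 + q)] for 0 <= q <= 1."""
    # q ln q tends to 0 as q -> 0, so the q = 0 case is handled explicitly.
    q_log_q = 0.0 if q == 0.0 else q * math.log(q)
    return 0.5 * (2.0 * math.log(2.0) + q_log_q - (1.0 + q) * math.log(1.0 + q))


# Work (in units of NkT) for q = 0, 0.1, ..., 1.0.
work_values = [mixing_work_per_NkT(i / 10) for i in range(11)]
```

The first entry equals $\ln 2$ (distinct gases), the last entry vanishes (identical gases), and the values in between decrease monotonically, since $dW/dq = (NkT/2)\ln[q/(1+q)] < 0$.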
C. Quantum state discrimination and the second law



As we have seen in the previous section, in his attempt to solve the Gibbs paradox, Landé deduced that a thermodynamic speculation in the form of the continuity principle could lead to the partial likeness (or distinguishability) of states as a result of the wave nature of particles. On the other hand, starting from the distinguishability issue of quantum states, Peres showed that if it were possible to distinguish non-orthogonal quantum states perfectly, then the second law of thermodynamics would necessarily be violated [29, 30].

As a background, let us consider an elementary work-extraction process using a collection of pure orthogonal states. As shown in Fig. 7, a chamber of volume $V$ is partitioned by a wall into two parts, one of which has a volume $p_1V$ and the other $p_2V$, where $p_1 + p_2 = 1$. The vessel is filled with a gas of molecules whose (quantum) internal degree of freedom is represented by, for example, a spin. Here it suffices to consider a gas of spin-1/2 molecules, e.g. a gas with spin up, i.e. $|\uparrow\rangle$, in the left region and a spin down gas, $|\downarrow\rangle$, in the right.

Now we reintroduce semipermeable membranes, $M_\uparrow$ and $M_\downarrow$, that distinguish the two orthogonal spin states, $|\uparrow\rangle$ and $|\downarrow\rangle$. These are essentially the same as what we have seen in Section V B to consider the 'fractional likeness' of states. The membrane $M_\uparrow$ is completely transparent to the $|\downarrow\rangle$ gas and completely opaque to the $|\uparrow\rangle$ gas. The other membrane $M_\downarrow$ has the opposite property.

Suppose that the partition separating the two gases is replaced by the membranes, so that $M_\uparrow$ and $M_\downarrow$ face $|\uparrow\rangle$ and $|\downarrow\rangle$, respectively, as in Fig. 7. Then, as in Section V B, the gases give us some work, expanding isothermally in contact with a heat bath of temperature $T$. The total work extractable can then be computed as $W = -p_1\log p_1 - p_2\log p_2 = H(p_i)$, where $H(p_i) = -\sum_i p_i\log p_i$ is the Shannon entropy of a probability distribution $\{p_i\}$.

Let us now look at Peres's process. The physical system we consider is almost the same as the one in Fig. 7; however, we now have two non-orthogonal states. Although the gases consist of photons of different polarizations in the original example by Peres, we consider spin-1/2 molecules to avoid the argument about the particle nature of light. The volume of the chamber here is $2V$, and in the initial state, the gas of volume $V$ is divided into two equal volumes $V/2$ separated by an impenetrable wall (see Fig. 8(a)). The gas molecules on the left side have spin up, $|\uparrow\rangle$, and those on the right side have spin $|\rightarrow\rangle = (|\uparrow\rangle + |\downarrow\rangle)/\sqrt{2}$. Both parts have the same number of molecules, $N/2$, and thus the same pressure. The first step is to let the gases expand isothermally at temperature $T$ so that they occupy the entire chamber (Fig. 8(b)). During this expansion, the gases exert 1 bit of work ($= NkT\ln 2$) on the outside, absorbing the same amount of heat from the heat bath.

FIG. 7: The work-extracting process with semipermeable membranes. In the initial state (a), the vessel is divided into two parts by an impenetrable opaque partition. The left side of the vessel, whose volume is $p_1V$, is occupied by the $|\uparrow\rangle$-gas, and the right side is filled with the $|\downarrow\rangle$-gas. By replacing the partition with two semipermeable membranes, $M_\uparrow$ and $M_\downarrow$, we can extract $H(p_i) = -p_1\log p_1 - p_2\log p_2$ bits of work by isothermal expansion. The membranes reach the end of the vessel in the final state (b).

In the second step, we introduce fictitious "magic" membranes that can distinguish non-orthogonal states. We replace the partition at the centre with these membranes and insert an impenetrable piston at the right end of the vessel. The membrane $M'_\uparrow$, which is transparent to the $|\rightarrow\rangle$-gas but opaque to the $|\uparrow\rangle$-gas, is fixed at the centre, while the other one, $M'_\rightarrow$, which has the opposite property to $M'_\uparrow$, can move in the area on the left. Then, the piston inserted at the right end and $M'_\rightarrow$ are pushed towards the left at the same speed so that the volume and the pressure of the $|\rightarrow\rangle$-gas in between the piston and the membrane $M'_\rightarrow$ are kept constant (from Fig. 8(b) to (c)). Because of the property of the membranes, this step can be achieved without friction or resistance, and thus needs no work consumption or heat transfer.

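The gas in the vessel in Fig. 8(c) is a mixture of two spin states. The density matrix for this mixture is

$$\rho = \frac{1}{2}|\uparrow\rangle\langle\uparrow| + \frac{1}{2}|\rightarrow\rangle\langle\rightarrow| = \frac{1}{4}\begin{pmatrix} 3 & 1 \\ 1 & 1 \end{pmatrix} \tag{12}$$

in the $\{|\uparrow\rangle, |\downarrow\rangle\}$-basis. The eigenvalues of $\rho$ are $(1 + \sqrt{2}/2)/2 = 0.854$ and $(1 - \sqrt{2}/2)/2 = 0.146$, with corresponding eigenvectors $|\nearrow\rangle = \cos\frac{\pi}{8}|\uparrow\rangle + \sin\frac{\pi}{8}|\downarrow\rangle$ and $|\nwarrow\rangle = \cos\frac{5\pi}{8}|\uparrow\rangle + \sin\frac{5\pi}{8}|\downarrow\rangle$, respectively.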

Now let us replace the "magic" membranes by ordinary ones, which discriminate the two orthogonal states $|\nwarrow\rangle$ and $|\nearrow\rangle$. The reverse process of (b)→(c) with these ordinary membranes separates $|\nwarrow\rangle$ and $|\nearrow\rangle$ to reach the state (d). Then, after replacing the semipermeable membranes by an impenetrable wall, we compress the gases in the left and the right parts isothermally until the total volume and the pressure of the gases become equal to the initial ones, i.e. those in state (a). This compression requires a work investment of $-(0.854\log 0.854 + 0.146\log 0.146) = 0.600$ bits, which is dissipated into the heat bath. In order to return to the initial state (a) from (e), we rotate the direction of the spins so that the left half of the gas becomes $|\uparrow\rangle$ and the right half becomes $|\rightarrow\rangle$. More specifically, we insert an opaque wall into the vessel to halve the volume $V$ occupied by the gases (the border between the regions labelled A and B in Fig. 8(e)). The rotations $|\nearrow\rangle \to |\uparrow\rangle$ in region A, $|\nearrow\rangle \to |\rightarrow\rangle$ in B and $|\nwarrow\rangle \to |\rightarrow\rangle$ in C, together with a trivial spatial shift, restore the initial state (a). As the rotations here are unitary transformations, and thus isentropic, any energy that has to be supplied can be reversibly recaptured. Alternatively, we can put the system in an environment such that all these pure spin states are degenerate energy eigenstates. Hence, we do not have to consider the work expenditure in principle when the process is isentropic.

FIG. 8: A thermodynamic cycle given by Peres to show that distinguishing non-orthogonal quantum states leads to a violation of the second law. The arrows indicate the directions of spin in the Bloch sphere. The use of hypothetical semipermeable membranes, which distinguish the non-orthogonal states $|\uparrow\rangle$ and $|\rightarrow\rangle$ perfectly, in the step from (b) to (c) is the key to violating the second law. (Modified from Fig. 9.2 in Ref. [30].)



Throughout the process depicted above and in Fig. 8,
the net work gained is $1 - 0.600 = 0.400$ bits. Therefore,
Peres's process can complete a cycle that can withdraw 
heat from a heat bath and convert it into mechanical 
work without leaving any other effect in the environment. 
This implies that the second law sets a barrier to quan- 
tum state discrimination. 
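The numbers quoted for Peres's cycle can be checked with a few lines of Python. The sketch below diagonalizes the $2\times 2$ density matrix $\rho = \frac{1}{2}|\uparrow\rangle\langle\uparrow| + \frac{1}{2}|\rightarrow\rangle\langle\rightarrow|$ in closed form and evaluates the compression cost and the net work in bits (1 bit of work corresponding to $NkT\ln 2$). The helper names are our own, and the closed-form eigenvalue formula is the standard one for a real symmetric $2\times 2$ matrix.

```python
import math


def eigenvalues_2x2_symmetric(a: float, b: float, d: float):
    """Eigenvalues of the real symmetric matrix [[a, b], [b, d]], largest first."""
    mean = 0.5 * (a + d)
    radius = math.hypot(0.5 * (a - d), b)
    return mean + radius, mean - radius


def entropy_bits(probabilities) -> float:
    """Shannon / von Neumann entropy in bits of a list of eigenvalues."""
    return -sum(p * math.log2(p) for p in probabilities if p > 0.0)


# rho = (1/4) [[3, 1], [1, 1]] in the {|up>, |down>} basis.
lam_plus, lam_minus = eigenvalues_2x2_symmetric(0.75, 0.25, 0.25)

# Work invested in the isothermal compression, in bits.
compression_cost = entropy_bits([lam_plus, lam_minus])

# One bit is gained in the first expansion step, so the net gain per cycle is:
net_work = 1.0 - compression_cost
```

The eigenvalues come out close to 0.854 and 0.146, the compression cost close to 0.6 bits, and the net gain close to 0.4 bits, in agreement with the values quoted in the text.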



D. Linearity in quantum dynamics 

Peres also showed that the second law would be violated if we admitted nonlinear (time) evolution of quantum states [31]. His proof is concise and is summarized below.

Let the state $\rho$ be a mixture of two pure states, $\rho = p|\phi\rangle\langle\phi| + (1-p)|\psi\rangle\langle\psi|$ with $0 < p < 1$. By rewriting one of the state vectors, say $|\psi\rangle$, as $|\psi\rangle = \sqrt{f}\,|\phi\rangle + \sqrt{1-f}\,|\phi^\perp\rangle$, where $|\phi^\perp\rangle$ is a vector orthogonal to $|\phi\rangle$ and $f = |\langle\phi|\psi\rangle|^2$, $\rho$ can be written in a matrix form (in the 2-dimensional subspace that supports $\rho$) as

$$\rho = \begin{pmatrix} p + (1-p)f & (1-p)\sqrt{f(1-f)} \\ (1-p)\sqrt{f(1-f)} & (1-p)(1-f) \end{pmatrix}. \tag{13}$$



The von Neumann entropy can be computed as $S(\rho) = -\lambda_+\log\lambda_+ - \lambda_-\log\lambda_-$, where $\lambda_\pm$ are the eigenvalues of $\rho$, i.e.

$$\lambda_\pm = \frac{1}{2} \pm \sqrt{\frac{1}{4} - p(1-p)(1-f)}. \tag{14}$$

We can see that $dS/df < 0$ for all $p$. Therefore, in order not to make the entropy decrease in time, the change of $f$ must be non-positive: $|\langle\phi(t)|\psi(t)\rangle|^2 \le |\langle\phi(0)|\psi(0)\rangle|^2$.

Now let $\{|\phi_k\rangle\}$ be a complete orthonormal set spanning the whole Hilbert space. Then, for any pure state $|\psi\rangle$, $\sum_k |\langle\phi_k|\psi\rangle|^2 = 1$. Thus, if there is some $m$ for which $|\langle\phi_m(t)|\psi(t)\rangle|^2 < |\langle\phi_m(0)|\psi(0)\rangle|^2$, there must be some $n$ for which $|\langle\phi_n(t)|\psi(t)\rangle|^2 > |\langle\phi_n(0)|\psi(0)\rangle|^2$, which means that the entropy of a mixture of $|\phi_n\rangle\langle\phi_n|$ and $|\psi\rangle\langle\psi|$ will decrease in a closed system. Hence, $f = |\langle\phi|\psi\rangle|^2$ needs to be constant for any $|\phi\rangle$ and $|\psi\rangle$ to comply with the second law. There are still two possibilities for the time evolution of states $|\psi(0)\rangle \to |\psi(t)\rangle$ that keep $f$ constant, namely unitary and antiunitary evolutions, according to Wigner's theorem [32]. Nevertheless, the latter possibility can be excluded due to the continuity requirement. Therefore, the evolution of quantum states is unitary, which is linear.
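The monotonic dependence of the entropy on the overlap $f$, on which the argument rests, is easy to verify numerically. The Python sketch below evaluates $S(\rho)$ from the eigenvalues $\lambda_\pm = \frac{1}{2} \pm \sqrt{\frac{1}{4} - p(1-p)(1-f)}$ for a fixed mixing probability; the choice $p = 0.3$, the grid in $f$ and the function name are illustrative assumptions of ours.

```python
import math


def mixture_entropy_bits(p: float, f: float) -> float:
    """Entropy (bits) of rho = p|phi><phi| + (1-p)|psi><psi| with f = |<phi|psi>|^2."""
    # Guard against tiny negative values caused by floating-point rounding.
    discriminant = max(0.0, 0.25 - p * (1.0 - p) * (1.0 - f))
    root = math.sqrt(discriminant)
    eigenvalues = (0.5 + root, 0.5 - root)
    return -sum(lam * math.log2(lam) for lam in eigenvalues if lam > 0.0)


# Entropy for p = 0.3 as the overlap f runs from 0 (orthogonal) to 1 (identical).
entropy_values = [mixture_entropy_bits(0.3, i / 10) for i in range(11)]
```

The list decreases strictly from the binary entropy $H(0.3)$ at $f = 0$ to zero at $f = 1$, so any evolution that increased the overlap of two states would lower the entropy of their mixture.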



E. Second law and general relativity 

The second law of thermodynamics has an interesting implication not only in quantum mechanics, but also in the theory of gravity, through the impossibility of a perpetuum mobile of the second kind. This is illustrated by Bondi's thought experiment. Although an assumption in the idea presented below seems quite infeasible, let us have a look at it because it is a nice heuristic introduction to the discussion of 'real' physics later. Imagine a vertically placed conveyor belt that has a number of single-atom holders on it, as in Fig. 9. Let us assume that the atoms on the left side are in an excited state and those on the right side are in a lower energy state. When an excited atom reaches the bottom of the belt, it emits a photon, lowering its energy level, and the emitted photon will be reflected by the curved mirrors so as to be directed to the atom at the top of the belt. Then this atom at the top will be excited, absorbing the photon.

FIG. 9: Bondi's thought experiment for a perpetuum mobile. Excited atoms (red balls) emit a photon at the bottom of the belt and thereby lower their energy level (blue balls). The emitted photon is reflected by the curved mirrors, placed so that it will be absorbed by an atom in the lower energy level at the top of the belt.

As energy is equivalent to mass, according to special relativity, the excited atoms on the left side are always heavier than those on the right, as long as the emission and absorption of photons work as described above. That is, the gravitational force will keep the belt rotating forever. In this scenario, however, there is another assumption, which seems implausible, that atoms emit or absorb photons only at the bottom or top of the belt. Such an assumption makes this device unlikely to work. Even granting this assumption, however, Bondi's perpetuum mobile is not compatible with physical laws, for the following reason.

What prevents this machine from perpetual motion is actually the distorted spacetime, i.e. a change of the metric, which we perceive as gravity. The theory of general relativity tells us that space is more stretched the closer one gets to the 'horizon': the length of a geodesic line is longer near the horizon for a given length in the normal sense, which is defined as the circumference of the sphere around the massive object divided by $2\pi$. Because light travels along a geodesic, stationary observers see that the wavelength becomes longer as light moves away from the object: the light becomes red-shifted.

An experiment to confirm this red-shift was carried out in 1960 by Pound and Rebka [33]. They made use of the Mössbauer effect of nuclear resonance, which can be used to detect extremely small changes in frequency. Their results demonstrated that photons do change their frequency by a few parts in $10^{15}$ when they travel 22.5 m vertically, which agreed with Einstein's prediction with high accuracy (only one percent error in the end [34]). The existence of the gravitational red-shift directly rules out Bondi's perpetuum mobile.
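The order of magnitude quoted above can be reproduced with the standard weak-field estimate $\Delta\nu/\nu \approx gh/c^2$ for a photon climbing a height $h$ near the Earth's surface. This formula and the value $g = 9.81\ \mathrm{m/s^2}$ are our assumptions for illustration and are not taken from the text; the function name is likewise our own.

```python
C = 299_792_458.0   # speed of light in m/s (exact SI value)
G_SURFACE = 9.81    # assumed gravitational acceleration at the Earth's surface, m/s^2


def fractional_redshift(height_m: float, g: float = G_SURFACE) -> float:
    """Weak-field estimate g*h/c^2 of the fractional frequency shift over height_m."""
    return g * height_m / C**2


# Height used in the experiment described above.
shift_22_5_m = fractional_redshift(22.5)
```

The result is a few parts in $10^{15}$. The same factor closes the loophole in Bondi's machine: a photon of energy $E$ loses $Egh/c^2$ on the way up, which is exactly the gravitational work $mgh$ gained from the extra mass $m = E/c^2$ of an excited atom descending through the height $h$.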



F. Einstein equation from thermodynamics 

Einstein's equation, which describes the effect of
energy-mass on the geometrical structure of the four-dimensional
spacetime, can be derived from a fundamental
thermodynamic relation. In thermodynamics, knowing
the entropy of a system as a function of energy and volume
is enough to get the equation of state from the fundamental
relation, $\delta Q = TdS$. Jacobson tried to obtain the
field equation as an equation of state, starting from thermodynamic
properties of black holes [35]. It had been
known by then that there was a strong analogy between
the laws of black hole mechanics and thermodynamics
[36]. That is, the horizon area of a black hole does not
decrease with time, just as the entropy in thermodynam- 
ics. Bekenstein then argued that the black hole entropy 
should be proportional to its horizon area after introduc- 
ing the entropy as a measure of information about the 
black hole interior which is inaccessible to an exterior 
observer [Hi!]. 

Bekenstein's idea suggests that it is natural to regard the (causal) horizon as a diathermic wall that prevents an observer from obtaining information about the other side of it. On the other hand, a uniformly accelerated observer sees black body radiation of temperature $T$ from the vacuum (the Unruh effect) [39, 40]. The origin of the Unruh effect lies in the quantum fluctuation of the vacuum, which is also the origin of the entropy of the horizon, i.e. the correlation between both sides of the horizon. Thus, in order to start from the above relation, $\delta Q = TdS$, Jacobson associated $\delta Q$ and $T$ with the energy flow across the causal horizon and the Unruh temperature seen by the observer inside the horizon, respectively. Then, Einstein's field equation can be obtained by expressing the energy flow in terms of the energy-momentum tensor $T_{\mu\nu}$ and the (horizon) area variation in terms of the 'expansion' of the horizon generators. Another essential element in the field equation, the Ricci tensor $R_{\mu\nu}$, appears in the form of the expansion through the Raychaudhuri equation (for example, see Ref. [41]), which describes the rate of the volume (area) change of an object in a Riemannian manifold. The resulting equation is thus (by setting $c = 1$)

$$R_{\mu\nu} - \frac{1}{2}Rg_{\mu\nu} + \Lambda g_{\mu\nu} = \frac{2\pi k}{\hbar\eta}T_{\mu\nu}, \tag{15}$$



where $\eta$ is the proportionality constant between the entropy and the horizon area, viz. $dS = \eta\,dA$, and $R$ and $\Lambda$ are the scalar curvature and the cosmological constant. Comparing with the standard expression of the equation, in which the coefficient of $T_{\mu\nu}$ is $8\pi G$, we identify $\eta$ to be $k/(4\hbar G) = k/(4l_P^2)$ with the Planck length $l_P$, which agrees with the derivation of $\eta$ in Ref. [42].
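To get a feeling for the size of $\eta$, the following Python sketch evaluates the Planck length and the entropy per unit horizon area in SI units. Restoring the factors of $c$ that the text sets to 1, we assume $l_P^2 = \hbar G/c^3$ and hence $\eta = k/(4l_P^2) = kc^3/(4\hbar G)$. The numerical constants are standard tabulated values, and the function names are our own.

```python
import math

HBAR = 1.054571817e-34   # reduced Planck constant in J s
G_NEWTON = 6.67430e-11   # Newtonian constant of gravitation in m^3 kg^-1 s^-2
C = 299_792_458.0        # speed of light in m/s
K_B = 1.380649e-23       # Boltzmann constant in J/K


def planck_length_m() -> float:
    """Planck length l_P = sqrt(hbar * G / c^3) in metres."""
    return math.sqrt(HBAR * G_NEWTON / C**3)


def eta_per_square_metre() -> float:
    """Proportionality constant eta = k / (4 l_P^2) between entropy and horizon area."""
    return K_B / (4.0 * planck_length_m() ** 2)


def horizon_entropy(area_m2: float) -> float:
    """Entropy S = eta * A (in J/K) associated with a horizon of area `area_m2`."""
    return eta_per_square_metre() * area_m2
```

Because $l_P$ is of the order of $10^{-35}$ m, even a horizon of one square metre carries an enormous entropy in units of $k$, namely one quarter of its area measured in Planck units.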

Einstein's field equation can indeed be seen as a thermodynamic equation of state. An important assumption for the above derivation is, however, the existence of a local equilibrium condition, for which the relation $\delta Q = TdS$ is valid. This means that it would not be appropriate to quantize the field equation, just as it is not appropriate to quantize the wave equation for sound propagation in air. Further, Jacobson speculated that the Einstein equation might not describe the gravitational field with sufficiently high frequency or large amplitude disturbances, because the local equilibrium conditions would break down in such situations, as in the case of sound waves.

Bekenstein's conjecture about the black hole entropy is now widely accepted as a real physical property, particularly after the discovery of Hawking radiation [42], which showed that black holes do radiate particles in a thermal distribution at finite temperature. The thermodynamics of black holes is still an extensive and active field, whose famous problems include the 'black hole information paradox'. Covering these topics in detail goes beyond the scope of this brief overview, so here we simply list a few references [43, 44, 45, 46].

It is now clear that this example illustrates a close connection between information, thermodynamics, and general relativity, which might look unrelated to each other at first sight. This again strongly suggests the duality of entropy, which we mentioned at the end of Section III C, and the universality of the thermodynamic relations in generic physics. We shall attempt to explore this duality in the paradigm of quantum information theory later on in Section VII B.



VI. ERASURE OF CLASSICAL INFORMATION 
ENCODED IN QUANTUM STATES 

In classical information theory, a letter $i \in \{1, \ldots, n\}$
of an alphabet, which appears with probability $p_i$ in a message
[114] generated by a source, is represented by one
of the $n$ different classical states. On the other hand, in
quantum information theory [4!| , information is encoded 
in quantum states, so that each letter $i$ is represented
by one of the $n$ different quantum states whose density
matrices are denoted by $\rho_i$. We will refer to each state
carrying a letter as a message state. We will also
call a set of quantum states used in a message, in which
the state $\rho_i$ appears with probability $p_i$, an ensemble of
quantum states $\{p_i, \rho_i\}$, $i \in \{1, \ldots, n\}$.

A smart way to erase classical information encoded in quantum states was first considered by Lubkin [2], who introduced erasure by thermal randomization, and by Vedral [50, 51] in a more general setting. Thermal randomization makes use of the randomness of states in a heat bath that is in thermal equilibrium. If we put a message state in contact with a heat bath at temperature $T$, the state will approach thermal equilibrium with the heat bath. More precisely, a message state $\rho_i$ changes gradually after colliding (interacting) with the heat bath, and sufficiently many collisions make the state indistinguishable from that of the heat bath. We assume that the bath's state as a whole will not change much since its size is very large. Due to the uncertainty stemming from thermal fluctuations, we irreversibly lose the information that was carried by the state $\rho_i$.

FIG. 10: The erasure of classical information carried by quantum states. Each message state interacts with a heat bath at temperature $T$ and reaches thermal equilibrium. The information originally encoded in a state is lost and all states end up in $\omega$, which is the thermal state at the temperature $T$.

Because of the generic nature of this erasure process by thermalization, the entropy of the whole system, consisting of the message state and the heat bath, necessarily increases. How much would this increase be? Let us first simplify the discussion by considering that each message state is a pure state, as in Fig. 10 [2, 50]. Before erasing, the whole message is an ensemble $\{p_i, |\phi_i\rangle\}$, thus its average state is described by a density operator $\rho = \sum_i p_i|\phi_i\rangle\langle\phi_i|$. The thermalization process brings all states $|\phi_i\rangle$ to the same state $\omega$, which is in thermal equilibrium at temperature $T$. The density matrix $\omega$ is given by

$$\omega = \frac{e^{-\beta H}}{Z} = \frac{1}{Z}\sum_j e^{-\beta e_j}|e_j\rangle\langle e_j|, \tag{16}$$

where $H = \sum_i e_i|e_i\rangle\langle e_i|$ is the Hamiltonian of the message state with energy eigenstates $|e_i\rangle$, $Z = \mathrm{Tr}(e^{-\beta H})$ is the partition function, and $\beta = (kT)^{-1}$.

The total entropy change $\Delta S_{\rm erasure}$ is the sum of the
entropy change of the message system and that of the heat bath:
$\Delta S_{\rm erasure} = \Delta S_{\rm sys} + \Delta S_{\rm bath}$.
Since the message state before the erasure is pure and its state
after the erasure is the same as that of the heat bath, the minimum
entropy change in the message state is given by [115]

$$\Delta S_{\rm sys} = k \ln 2\, S(\omega), \qquad (17)$$

where $S(\omega) = -{\rm Tr}(\omega \log \omega)$ is the von Neumann
entropy of the state $\omega$. Von Neumann introduced this entropy by
contemplating the disorder of quantum states so that it has the same
meaning as the entropy in (phenomenological) thermodynamics in a
setting where gases of molecules with quantum properties were
considered [52]. The factor $k \ln 2$ is just a conversion factor to
make it consistent with the previous discussion of Landauer's
principle.

The entropy change in the heat bath is equal to the average heat
that the bath receives from the message system divided by the
temperature: $\Delta S_{\rm bath} = \Delta Q_{\rm bath}/T$. The heat
change in the heat bath is the same as that in the system with the
opposite sign, i.e. $\Delta Q_{\rm bath} = -\Delta Q_{\rm sys}$. The
heat transfer can be done quasistatically so that the mechanical work
required for the state change is arbitrarily close to 0. Therefore,
due to energy conservation, $\Delta Q_{\rm sys}$ must be equal to the
change of internal energy of the message system
$\Delta U_{\rm sys}$, which can be computed as the change of the
average value of the Hamiltonian $H$ before and after the erasure
process. Hence,

$$\Delta S_{\rm bath} = -\frac{\Delta Q_{\rm sys}}{T} = -\frac{\Delta U_{\rm sys}}{T} = -\frac{{\rm Tr}(\omega H) - {\rm Tr}(\rho H)}{T}. \qquad (18)$$

By using Eq. (16), the Hamiltonian $H$ can be expressed in terms of
the partition function $Z$ as $-kT \ln(Z\omega)$. Now we have

$$\Delta S_{\rm bath} = k\,{\rm Tr}[(\omega - \rho)\ln(Z\omega)] = k\,{\rm Tr}[(\omega - \rho)\ln\omega] = -k \ln 2\,[S(\omega) + {\rm Tr}(\rho \log \omega)]. \qquad (19)$$

The second equality holds because ${\rm Tr}(\omega - \rho) = 0$, so
the term proportional to $\ln Z$ drops out. Combining Eq. (17) and
Eq. (19) gives the total entropy change after the erasure:

$$\Delta S_{\rm erasure} = \Delta S_{\rm sys} + \Delta S_{\rm bath} = -{\rm Tr}(\rho \log \omega), \qquad (20)$$

where the unimportant conversion factor $k \ln 2$ is set to unity as
a unit of entropy. The minimum of the entropy change
$\Delta S_{\rm erasure}$ can be obtained as:

$$\Delta S_{\rm erasure} = -{\rm Tr}(\rho \log \omega) \ge S(\rho). \qquad (21)$$

The inequality follows from the non-negativity of the quantum
relative entropy,
$S(\rho\|\omega) := -S(\rho) - {\rm Tr}(\rho \log \omega) \ge 0$.
This minimum can be achieved by choosing the temperature of the heat
bath and the set $\{p_i, |\phi_i\rangle\}$ such that
$\rho = \sum_i p_i |\phi_i\rangle\langle\phi_i|$ is the same as the
thermal equilibrium state $\omega$. Consequently, the minimum entropy
increase required for the erasure of classical information encoded in
quantum states is given by the von Neumann entropy $S(\rho)$, where
$\rho$ is the average state of the system, instead of the Shannon
entropy $H(p)$ in the case of erasing information in classical
states.
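
The inequality (21) can be checked numerically for a single qubit. The following is a minimal sketch, not taken from the original work: it assumes real symmetric $2\times 2$ density matrices, a Hamiltonian that is diagonal in the basis in which $\rho$ is written, and a hypothetical ensemble ($|0\rangle$ and $|+\rangle$ with probability 1/2 each, energies 0 and 1 in units where $kT = 1$). All function names are illustrative.

```python
import math

def h_bits(probs):
    """Shannon entropy in bits; (numerically) zero probabilities are skipped."""
    return -sum(p * math.log2(p) for p in probs if p > 1e-15)

def qubit_entropy(rho):
    """Von Neumann entropy S(rho) in bits for a real symmetric 2x2 density matrix."""
    (a, c), (_, d) = rho
    gap = math.sqrt((a - d) ** 2 + 4 * c * c)  # difference of the two eigenvalues
    return h_bits([(a + d + gap) / 2, (a + d - gap) / 2])

def gibbs_weights(energies, beta):
    """Diagonal of omega = exp(-beta H) / Z in the energy eigenbasis, cf. Eq. (16)."""
    weights = [math.exp(-beta * e) for e in energies]
    z = sum(weights)
    return [w / z for w in weights]

def erasure_entropy(rho, omega_diag):
    """-Tr(rho log2 omega), cf. Eq. (20), for omega diagonal in the basis of rho."""
    return -sum(rho[j][j] * math.log2(omega_diag[j]) for j in range(2))

# Assumed example: |0> and |+> with probability 1/2 each; H = diag(0, 1), beta = 1.
rho = [[0.75, 0.25], [0.25, 0.25]]
omega = gibbs_weights([0.0, 1.0], beta=1.0)
cost = erasure_entropy(rho, omega)   # total entropy increase of Eq. (20)
bound = qubit_entropy(rho)           # S(rho), the lower bound of Eq. (21)

# Equality case: the average state coincides with the thermal state.
rho_eq = [[omega[0], 0.0], [0.0, omega[1]]]
cost_eq = erasure_entropy(rho_eq, omega)

print(f"cost = {cost:.4f} bits >= S(rho) = {bound:.4f} bits")
```

In this sketch the cost exceeds $S(\rho)$ because the assumed thermal state differs from the average message state; when the two coincide (`rho_eq`), the bound is saturated.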



VII. THERMODYNAMIC DERIVATION OF 
THE HOLEVO BOUND 

A. from the erasure principle 

Landauer's erasure principle, together with Lubkin's version for
quantum states, is simple in form; however, it implies some
significant results in the theory of quantum information. For
example, it can be used to derive the efficiency of the compression
of data carried by quantum states [9] and also the upper bound on the
efficiency of the entanglement distillation process [53]. Here, we
look at the derivation of the Holevo bound from Landauer's principle,
which was first discussed by Plenio [9, 54], as we will examine the
same problem from a different perspective in the next section.

To give a precise form of the Holevo bound, let us consider two
parties, Alice and Bob. Suppose Alice has a classical information
source preparing symbols $i = 1, \dots, n$ with probabilities
$p_1, \dots, p_n$. The aim for Bob is to determine the actual
preparation $i$ as best as he can. To achieve this goal, Alice
prepares a state $\rho_i$ with probability $p_i$ and gives the state
to Bob, who makes a general quantum measurement (Positive Operator
Valued Measure or POVM) with elements
$\{E_j\} = \{E_1, \dots, E_m\}$, $\sum_{j=1}^{m} E_j = \mathbb{1}$,
on that state. On the basis of the measurement result he makes the
best guess of Alice's preparation. The Holevo bound [55] is an upper
bound on the accessible information, i.e.

$$I(A:B) \le S(\rho) - \sum_i p_i S(\rho_i), \qquad (22)$$

where $I(A:B)$ is the mutual information between the set of Alice's
preparations $i$ and Bob's measurement outcomes $j$, and
$\rho = \sum_i p_i \rho_i$. The equality is achieved when all density
matrices commute, namely $[\rho_i, \rho_j] = 0$.
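
As an illustration of Eq. (22), the sketch below compares the mutual information obtained from one particular measurement with the Holevo quantity on its right-hand side. It is a hedged example under stated assumptions: single-qubit states given as real symmetric $2\times 2$ matrices, Bob measuring projectively in the computational basis (one choice among all POVMs, so the result is a lower estimate of the accessible information), and hypothetical ensembles chosen only for illustration.

```python
import math

def h_bits(probs):
    """Shannon entropy in bits; (numerically) zero probabilities are skipped."""
    return -sum(p * math.log2(p) for p in probs if p > 1e-15)

def qubit_entropy(rho):
    """Von Neumann entropy in bits for a real symmetric 2x2 density matrix."""
    (a, c), (_, d) = rho
    gap = math.sqrt((a - d) ** 2 + 4 * c * c)
    return h_bits([(a + d + gap) / 2, (a + d - gap) / 2])

def average_state(weights, states):
    """rho = sum_i p_i rho_i."""
    return [[sum(w * s[r][c] for w, s in zip(weights, states)) for c in range(2)]
            for r in range(2)]

def holevo_quantity(weights, states):
    """S(rho) - sum_i p_i S(rho_i), the right-hand side of Eq. (22)."""
    mixed = qubit_entropy(average_state(weights, states))
    return mixed - sum(w * qubit_entropy(s) for w, s in zip(weights, states))

def mutual_information(weights, states):
    """I(A:B) for a computational-basis measurement: P(i, j) = p_i (rho_i)_jj."""
    joint = [[w * s[j][j] for j in range(2)] for w, s in zip(weights, states)]
    outcome = [sum(row[j] for row in joint) for j in range(2)]
    return h_bits(weights) + h_bits(outcome) - h_bits([p for row in joint for p in row])

KET0 = [[1.0, 0.0], [0.0, 0.0]]
KET1 = [[0.0, 0.0], [0.0, 1.0]]
PLUS = [[0.5, 0.5], [0.5, 0.5]]

# Non-orthogonal signals: the bound is not reached by this measurement.
info_nonorth = mutual_information([0.5, 0.5], [KET0, PLUS])
chi_nonorth = holevo_quantity([0.5, 0.5], [KET0, PLUS])

# Orthogonal (commuting) signals measured in their eigenbasis: equality.
info_orth = mutual_information([0.5, 0.5], [KET0, KET1])
chi_orth = holevo_quantity([0.5, 0.5], [KET0, KET1])
```

The commuting pair saturates the bound, in line with the equality condition stated above, while the non-orthogonal pair stays strictly below it for this measurement.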

Let us first consider a simple case in which all $\rho_i$ are pure:
$\rho_i = |\psi_i\rangle\langle\psi_i|$. The average state will be
$\rho = \sum_i p_i |\psi_i\rangle\langle\psi_i|$. Then, the Shannon
entropy of the message is always greater than or equal to the von
Neumann entropy of the encoded quantum state (Theorem 11.10 in
Ref. [13]), that is, $H(p_i) \ge S(\rho)$, with equality if and only
if $\langle\psi_i|\psi_j\rangle = \delta_{ij}$ for all $i$ and $j$.

How much information can Bob retrieve from the state $\rho$? The
above analysis of erasure of information in quantum states tells us
that we have to invest at least $S(\rho)$ bits of entropy to destroy
all available information. This implies that the amount of
information that Bob can access is bounded by $S(\rho)$, because if
he could obtain $S(\rho) + \epsilon$ bits of information then the
minimum entropy increase by the erasure should be at least
$S(\rho) + \epsilon$, by Landauer's principle. In other words, one
cannot obtain more information than can be erased.

Therefore, the accessible information $I(A:B)$ is smaller than or
equal to the minimum entropy increase by the erasure and it is this
relation that corresponds to the inequality (22). If Alice encodes
her information $i$ into a pure state $\rho_i$, the relation reads

$$I(A:B) \le S(\rho), \qquad (23)$$

which is the same inequality as Eq. (22) because $S(\rho_i) = 0$ for
all pure states $\rho_i$.

If Alice uses mixed states $\rho_i$ to encode $i$, Eq. (23) needs to
be modified. Instead of Eq. (17), the entropy increase for each state
is now [116] $\Delta S^{j}_{\rm sys} = S(\omega) - S(\rho_j)$ after
contact with a heat bath whose state is given by Eq. (16). The
average entropy change of the heat bath is the same as Eq. (19):
$\Delta S_{\rm bath} = -S(\omega) - {\rm Tr}(\rho \log \omega)$. The
total entropy change by the thermalization will be

$$\Delta S_{\rm erasure} = \sum_j p_j \Delta S^{j}_{\rm sys} + \Delta S_{\rm bath}$$
$$= \sum_j p_j \left(S(\omega) - S(\rho_j)\right) - S(\omega) - {\rm Tr}(\rho \log \omega)$$
$$= -{\rm Tr}(\rho \log \omega) - \sum_j p_j S(\rho_j)$$
$$\ge S(\rho) - \sum_j p_j S(\rho_j), \qquad (24)$$

which, together with the above argument, implies the Holevo bound in
the form of Eq. (22). This analysis of erasure of information encoded
with mixed states is more straightforward and less ambiguous than
that in Refs. [9, 54].

The above analysis thus justifies the Holevo bound. However, it does
not give the precise condition for the equality in Eq. (22), which is
$[\rho_i, \rho_j] = 0$. The condition we can derive here is that all
density matrices $\{\rho_i\}$ have support on mutually orthogonal
subspaces, i.e. ${\rm Tr}(\rho_i \rho_j) = 0$ for $i \neq j$. This is
more restrictive than the commutativity of density matrices mentioned
above. In the next section, we will see if the second law implies the
Holevo bound more directly.

B. from the second law 

Since the work-information duality in the erasure principle supports
Brillouin's hypothesis, which we have seen in Section III C, on the
equivalence between information theoretic and thermodynamic
entropies, it might be natural to expect that the second law may put
a certain bound on the quality of information or the performance of
information processing. In this section, we derive the general bound
on the storage of quantum information, the Holevo bound [55], from
the second law of thermodynamics. As the second law is the most
fundamental physical law that governs the behavior of entropy, this
problem is interesting in terms of the spirit of the 'physics of
information' and deserves to be investigated in its own right.

In order to see the genuine thermodynamic bound, we need to minimize
the axiomatic assumptions that stem from quantum mechanics. The
assumptions we make here are: (a) Entropy: the von Neumann entropy is
equivalent to the thermodynamic entropy; (b) Statics and measurement:
a physical state is described by a "density" matrix, and the state
after a measurement is a new state that corresponds to the outcome
("projection postulate"); (c) Dynamics: there exist isentropic
transformations.

Employing the density matrix-based description means that we presume
the existence of superpositions of states. Allowing superpositions
might sound rather abrupt; however, we can regard this as taking a
stance similar to Gabor's picture on the possibility of
superpositions purely from thermodynamic considerations. Thus, this
could be stated the other way around: assuming the superpositions of
states, we can describe a state by a density matrix, which can be
defined as a convex sum of outer products of normalized "state
vectors". The nonzero components of state vectors represent
superposed state elements. Probability distributions in classical
phase space can also be described consistently: all diagonal elements
of a classical density matrix are real, representing probabilities,
and all off-diagonal elements are zero. When a measurement is
performed, one of the diagonal elements becomes 1, and all others
become 0. We will use an arrow to denote a state vector, such as
$\vec{\psi}$, to make it clear that we do not use the full machinery
of the Hilbert space (such as the notion of inner product) and we
never use the Born trace rule for calculating probabilities.

Consider a chamber of volume $V$, which is divided into two regions
of volumes $p_1 V$ and $p_2 V$, respectively. The left-side region
(L) is filled with $p_1 N$ molecules in state $\vec{\psi}_1$, and
$p_2 N$ molecules in state $\vec{\psi}_2$ are located in the
right-side region (R). The two states, $\vec{\psi}_1$ and
$\vec{\psi}_2$, can be thought of as representing an internal degree
of freedom. Generalizations to arbitrary numbers of general (mixed)
states and general measurements are straightforward.

We can now have a thermodynamic loop formed by two different paths
between the above initial thermodynamic state and the final state
(Fig. 11). In the final state, both constituents, $\vec{\psi}_1$ and
$\vec{\psi}_2$, are distributed uniformly over the whole volume.
Hence, each molecule in the final state can be described by
$\rho = \sum_i p_i \vec{\psi}_i \vec{\psi}_i^{\dagger}$, regardless
of the position in the chamber. One of the paths converts heat into
work, while the other path, consisting of a quasi-static reversible
process and isentropic transformations, requires some work
consumption.

In the work-extracting process, we make use of two semipermeable
membranes, $M_1$ and $M_2$, which separate two perfectly
distinguishable (orthogonal) states $\vec{e}_1$ and
$\vec{e}_2$ $(= \vec{e}_1^{\,\perp})$. The membrane $M_i$
($i = 1, 2$) acts as a completely opaque wall to molecules in
$\vec{e}_i$, but it is transparent to molecules in
$\vec{e}_{j \neq i}$. Thus, for example, a state $\vec{\psi}_i$ is
reflected by $M_1$ to become $\vec{e}_1$ with (conditional)
probability $p(e_1|\psi_i)$ and goes through with probability
$p(e_2|\psi_i)$, being projected onto $\vec{e}_2$. This corresponds
to the quantum (projective) measurement on molecules in the basis
$\{\vec{e}_1, \vec{e}_2\}$; however, we do not compute these
probabilities specifically, as stated above.

By replacing the impenetrable partition with the two membranes, we
can convert heat from the heat bath into mechanical work
$W_{\rm ext}$, which can be as large as the accessible information
$I(A:B)$, i.e. the amount of information Bob can obtain about Alice's
preparation by measurement in the basis $\{\vec{e}_1, \vec{e}_2\}$
[117]. The transformation from the post-work-extraction state, which
we call $\sigma$ hereafter, to the final state $\rho$ can be done by
the process shown in Fig. 12, and the minimum work needed is given by
$\Delta S = S(\sigma) - S(\rho)$.

FIG. 11: The thermodynamic cycle, which we use to discuss the second
law. The cycle proceeds from the initial state (a) to the final state
$\rho$ (c) via the post-work-extraction state $\sigma$ (b), and
returns to the initial state with a reversible process.

Another path, which is reversible, from the initial state to the
final state is as follows. Let $\{\vec{\phi}_1, \vec{\phi}_2\}$ be an
orthonormal basis which diagonalizes the density matrix $\rho$, such
that

$$\rho = \sum_k \lambda_k \vec{\phi}_k \vec{\phi}_k^{\dagger}, \qquad (25)$$

where $\lambda_k$ are the eigenvalues of $\rho$. We can extract
$S(\rho)$ bits of work by first transforming
$\{\vec{\psi}_1, \vec{\psi}_2\}$ to $\{\vec{\phi}_1, \vec{\phi}_2\}$
unitarily, and second using a new set of semipermeable membranes that
perfectly distinguish $\vec{\phi}_1$ and $\vec{\phi}_2$.

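If the initial state is a combination of mixed states with
corresponding weights given by $\{p_i, \rho_i\}$, the extractable
work during the transformation to $\rho = \sum_i p_i \rho_i$ becomes
$S(\rho) - \sum_i p_i S(\rho_i)$. This can be seen by considering a
process

$$\{p_i, \rho_i\} \xrightarrow{\ (i)\ } \{p_i \mu^{i}_{j}, \vec{\nu}^{\,i}_{j}\} \xrightarrow{\ (ii)\ } \{\lambda_k, \vec{\phi}_k\} \longrightarrow \rho,$$

where $\{\mu^{i}_{j}, \vec{\nu}^{\,i}_{j}\}$ and
$\{\lambda_k, \vec{\phi}_k\}$ are the sets of eigenvalues and
eigenvectors of $\rho_i$ and $\rho$, respectively [56].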

FIG. 12: The thermodynamic process to transform the intermediate
state $\sigma$ into the final state $\rho$. First, after attaching an
empty vessel of the same volume to that containing the gas $\sigma$,
the membranes $M_j$ are used to separate two orthogonal states
$\vec{e}_1$ and $\vec{e}_2$ ((a) to (c)). As the distance between the
movable opaque wall and the membrane $M_2$ is kept constant, this
process entails no work consumption/extraction. As
$\sigma = \sum_j c_j \vec{e}_j \vec{e}_j^{\dagger}$, compressing each
$\vec{e}_j$-gas into the volume of $c_j V$ as in (d) makes the
pressures of gases equal and this compression requires
$S(\sigma) = -\sum_j c_j \log_2 c_j$ bits of work. Second, quantum
states of gases are isentropically transformed, thus without
consuming work, so that the resulting state (e) will have
$\lambda_j N$ molecules in $\vec{\phi}_j$, where
$\rho = \sum_j \lambda_j \vec{\phi}_j \vec{\phi}_j^{\dagger}$ is the
eigendecomposition of $\rho$. To reach (f), $S(\rho)$ bits of work
can be extracted by using membranes that distinguish $\vec{\phi}_j$.
As a result, the work needed for the transformation
$\sigma \to \rho$ is $S(\sigma) - S(\rho)$ bits.

The function of the semipermeable membrane can alternatively be
understood as a Maxwell's demon who controls small doors on a
partition depending on the result of his measurement of each
molecule. Then do we need to consume some work to reset his memory?
Unlike the previous discussions (such as that of Szilard's engine),
it turns out that the demon's memory can be erased isentropically due
to the remaining (perfect) correlation between the state of each
molecule and his memory registers. This can be sketched as follows.
Once the demon observes a molecule, a correlation between the state
of the molecule and his memory is created. Since he can in principle
keep track of all molecules, a perfect correlation between the state
of the $n$-th molecule and that of the $n$-th register of the demon's
memory will be maintained. Then a controlled-NOT-like isentropic
operation between the molecules and the corresponding memory
registers (with molecules as a control bit) can reset the demon's
memory to a standard initial state without consuming work.
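
The reset step can be illustrated with classical bits. The sketch below is only a toy model under stated assumptions: each molecule is represented by the bit 0 or 1 that labels the outcome ($\vec{e}_1$ or $\vec{e}_2$) of the demon's measurement, each memory register holds a perfect copy of that bit, and the controlled-NOT acts as a reversible XOR. The function names are hypothetical.

```python
import random

def demon_measure(molecules):
    """Record each molecule's post-measurement state in a fresh memory register."""
    return list(molecules)

def cnot_reset(molecules, memory):
    """Controlled-NOT with the molecule as control bit and the register as target."""
    return [register ^ control for control, register in zip(molecules, memory)]

random.seed(0)
molecules = [random.randint(0, 1) for _ in range(16)]
memory = demon_measure(molecules)           # perfect molecule-register correlation
reset_memory = cnot_reset(molecules, memory)

# If the correlation is lost (here: the molecules are scrambled), the same
# operation no longer returns the memory to the standard state in general.
scrambled = [1 - bit for bit in molecules]
failed_reset = cnot_reset(scrambled, memory)
```

Because the operation is a bijection on the joint (molecule, register) states, it is logically reversible, which is the sense in which the reset costs no work in the argument above.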

The second law states (in Kelvin's form) that the net work
extractable from a heat bath cannot be positive after completing a
cycle, i.e. $W_{\rm ext} - W_{\rm inv} \le 0$. For the cycle
described in Fig. 11, it can be expressed as

$$I(A:B) \le S(\rho) - \sum_i p_i S(\rho_i) + \Delta S, \qquad (26)$$

where $\Delta S = S(\sigma) - S(\rho)$. As $\sigma$ is identical to
the resulting state of a projective measurement on $\rho$ in the
basis $\{\vec{e}_1, \vec{e}_2\}$, $\sigma = \sum_j P_j \rho P_j$ with
$P_j = \vec{e}_j \vec{e}_j^{\dagger}$, and consequently $\Delta S$ is
always non-negative (see Ref. [13]). The inequality (26) holds even
if the measurement by membranes is a generalized (POVM) measurement
[56].

The form of Eq. (26) is identical with that of Eq. (22) for the
Holevo bound, except for an extra non-negative term, $\Delta S$. This
illustrates that there is a difference between the bound imposed by
quantum mechanics (the Holevo bound) and the one imposed by the
second law of thermodynamics. Namely, there is a region in which we
could violate quantum mechanics while complying with the
thermodynamical law. In the classical limit, the measurement is
performed in the joint eigenbasis of mutually commuting $\rho_i$'s,
consequently $\Delta S = 0$, and, in addition, the Holevo bound is
saturated: $I(A:B) = S(\rho) - \sum_i p_i S(\rho_i)$. Thus, the
classical limit and the thermodynamic treatment give the same bound.

The same saturation occurs when an appropriate collective measurement
is performed on blocks of $m$ molecules, each of which is taken from
an ensemble $\{p_i, \rho_i\}$. When $m$ tends to infinity,
$2^{m(S(\rho) - \sum_i p_i S(\rho_i))}$ typical sequences (the
sequences in which $\rho_i$ appears about $p_i m$ times) become
mutually orthogonal and can be distinguished by "square-root" or
"pretty good" measurements [57, 58]. This situation is thus
essentially classical, hence $\Delta S \to 0$ and the Holevo bound
will be saturated.
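
The gap between the thermodynamic bound (26) and the Holevo bound (22) can be made concrete for one qubit. The sketch below is an illustration under stated assumptions, not the authors' calculation: a hypothetical ensemble of the pure states $|0\rangle$ and $|+\rangle$ with probability 1/2 each, membranes that measure in the computational basis, and real symmetric $2\times 2$ matrices throughout.

```python
import math

def h_bits(probs):
    """Shannon entropy in bits; (numerically) zero probabilities are skipped."""
    return -sum(p * math.log2(p) for p in probs if p > 1e-15)

def qubit_entropy(rho):
    """Von Neumann entropy in bits for a real symmetric 2x2 density matrix."""
    (a, c), (_, d) = rho
    gap = math.sqrt((a - d) ** 2 + 4 * c * c)
    return h_bits([(a + d + gap) / 2, (a + d - gap) / 2])

def dephase(rho):
    """sigma = sum_j P_j rho P_j for projectors onto the computational basis."""
    return [[rho[0][0], 0.0], [0.0, rho[1][1]]]

# Assumed ensemble: |0> and |+>, probability 1/2 each.
rho = [[0.75, 0.25], [0.25, 0.25]]
sigma = dephase(rho)

delta_s = qubit_entropy(sigma) - qubit_entropy(rho)   # extra term in Eq. (26)
holevo = qubit_entropy(rho)                           # pure signals: S(rho_i) = 0
second_law_bound = holevo + delta_s                   # right-hand side of Eq. (26)

# Information gained by this measurement: P(i, j) = p_i |<e_j|psi_i>|^2.
joint = [0.5, 0.0, 0.25, 0.25]
info = h_bits([0.5, 0.5]) + h_bits([0.75, 0.25]) - h_bits(joint)

print(f"I = {info:.4f} <= Holevo = {holevo:.4f} <= second law = {second_law_bound:.4f}")
```

For these non-commuting signal states $\Delta S$ is strictly positive, so the second-law bound is looser than the Holevo bound; choosing commuting states measured in their eigenbasis would give $\Delta S = 0$.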



VIII. ENTANGLEMENT DETECTION BY 
MAXWELL'S DEMON(S) 

Now let us move on to see how we can deal with entanglement from the
point of view of Maxwell's demon. The reason we pick up the topic of
entanglement in particular is that it is not only crucially important
in quantum information theoretic tasks, such as quantum cryptography
[59, 60], dense coding [61], quantum teleportation [62] and quantum
computation [63, 64], but it is also directly linked to the
foundations of quantum mechanics. However, the theory of entanglement
is too broad and deep to explore comprehensively in this article. In
addition, most of it appears irrelevant, or at least unclear, in the
context of Maxwell's demon. Therefore we focus only on some recent
works that have discussed the 'quantumness' of correlation and/or the
problem of entanglement detection, which is one of the most important
topics in its own right.

When we discuss entanglement in this article we are primarily
interested in bipartite entanglement, unless otherwise stated.
Formally, entanglement is defined as a form of quantum correlation
that is not present in any separable states. Let $\mathcal{H}^A$ and
$\mathcal{H}^B$ be the Hilbert spaces for two spatially separated
(non-interacting) subsystems $A$ and $B$, which are typically
referred to as Alice and Bob, and
$\mathcal{H}^{AB} = \mathcal{H}^A \otimes \mathcal{H}^B$ the whole
(joint) Hilbert space of the two. We also let
$\mathcal{S}(\mathcal{H})$ denote the state space, which is the set
of density operators acting on $\mathcal{H}$.

A state of a bipartite system is said to be separable or classically
correlated [65] if its density operator can be written as a convex
sum of products of density operators

$$\rho = \sum_{i=1}^{n} p_i\, \rho_i^A \otimes \rho_i^B, \qquad (27)$$

where all $p_i$ are nonnegative and $\sum_{i=1}^{n} p_i = 1$. Any
state that cannot be written in the form of Eq. (27) is called
entangled. We will let $\mathcal{S}_{\rm sep}$ denote the subspace
that contains all separable states.

It is natural to ask whether or not a given state
$\rho \in \mathcal{S}(\mathcal{H}^{AB})$ is separable, considering
the importance of entanglement in many quantum information theoretic
tasks [118]. Quite a few separability criteria, i.e. conditions that
are satisfied by all separable states, but not necessarily by
entangled states, have been proposed so far to answer this simple
question. Separability criteria are typically expressed in terms of
an operator or a function, such as an entanglement witness [66] or
the correlation function in Bell's inequality [67]. Despite the
simplicity of the question, it is generally very hard to find good
separability criteria. By a good separability criterion, we mean an
efficient separability criterion that singles out as many entangled
states as possible. The hardness of the problem is primarily related
to the convexity of the separable subspace, which is formed by all
the separable states: because of the bulgy 'surface' of the separable
subspace, there does not exist any operator/function that is linear
with respect to the matrix elements of the density operator, e.g.
eigenvalues of $\rho$, and distinguishes separable and entangled
states perfectly.

Another simple question is about the amount of entan- 
glement a pair (or a set) of quantum objects contains. It 
plays a major role when it comes to the characterization 
or manipulation of entanglement. Since entanglement 
can be regarded as a valuable resource in quantum in- 
formation processing, the quantification of entanglement 
is a problem of great interest and importance. Despite 
its profoundness, we will not go into details on the quantification
issue here. Instead, we refer interested readers to some references,
such as Refs. [68, 69, 70, 71]. Also, Ref. [51] contains not only a
review on entanglement measures, but also some discussions on quantum
information processing from the thermodynamic point of 
view. 



A. Work deficit 

In this subsection, we will review the concept of work deficit, which
was introduced by Oppenheim et al. [74]. An apparent goal of this
work was to quantify entanglement via a thermodynamic quantity; this
idea sheds new light on the quantumness of correlations by taking a
thermodynamic approach.

As we have emphasized, information is always stored in a physical
system with physical states that are distinguishable by measurement
so that stored information can be extracted. No generality is lost
when we think of a gas in a chamber, such as the one considered by
Szilard, as a general information-storage apparatus. Even if we had a
different type of physical system for information storage, the
information can be perfectly transferred for free to a memory of the
Szilard-engine type, provided that the Szilard engine is initially in
a standard state. Since the measurement can be done with negligible
energy consumption (as we have seen in Section III), the information
transfer can be completed by converting the state from the initial
standard state to the state corresponding to the stored (measured)
information; the final conversion requires no energy.

Now that we have identified a memory with the one-molecule gas of
Szilard's engine, we can present a general statement: from an
ensemble of memories, each of which stores the value of an $n$-bit
random variable $X$, one can extract mechanical work whose average
amount per single memory register is (by taking units such that
$k \ln 2 = 1$)

$$W_C = n - H(X), \qquad (28)$$

where $H(X)$ is the Shannon entropy of $X$. The extractable work is
the work done by the gas for memory, thus it is nothing but
Eq. (fj| when $n = 2$. Equation (28) can be easily understood in the
following way. Suppose there are $N$ memory registers. If we measure
all $N$ registers, the remaining uncertainty in the memory is zero,
and we can obtain $Nn$ bits of work. Nevertheless, we still keep the
information due to the measurement on the memory, and this needs to
be erased in order to discuss solely the amount of extractable work.
The minimum energy consumption to erase the information is, according
to the erasure principle, equal to $N H(X)$ bits. Thus the maximum
total amount of extractable work is given by $N(n - H(X))$ bits.
Alternatively, one can use the first law of thermodynamics to arrive
at the same expression as Eq. (28). The work done by the gas in an
isothermal process is equal to the entropy change multiplied by the
temperature.
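
Equation (28) is simple enough to evaluate directly. The following minimal sketch assumes the distribution of the $n$-bit variable $X$ is given as a list of $2^n$ probabilities and works in the units $k \ln 2 = 1$ used above; the function names are illustrative.

```python
import math

def shannon_entropy(probs):
    """H(X) in bits, with the convention 0 log 0 = 0."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

def extractable_work_bits(n_bits, probs):
    """Average extractable work per register, W_C = n - H(X), cf. Eq. (28)."""
    if len(probs) != 2 ** n_bits:
        raise ValueError("expected one probability per n-bit value")
    return n_bits - shannon_entropy(probs)

# A register in a known state yields the full n bits of work ...
known = extractable_work_bits(2, [1.0, 0.0, 0.0, 0.0])
# ... a completely random register yields none ...
uniform = extractable_work_bits(2, [0.25, 0.25, 0.25, 0.25])
# ... and a biased one-bit register lies in between.
biased = extractable_work_bits(1, [0.9, 0.1])
```

The two extreme cases reproduce the counting argument in the text: $n$ bits gained from measurement, minus $H(X)$ bits spent on erasing the measurement record.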

The same argument is applicable to work extraction from quantum bits
(qubits). Let $\rho$ be the density operator describing the state in
a given ensemble. Qubits after (non-collective) measurements are in a
known pure state, which is essentially a classical system in terms of
information. Thus the information stored in this set of pure states
can be copied to the Szilard-type memory and each register can give
us 1 bit of work. Then, after erasing the information acquired by the
measurement, the net maximum amount of work we get becomes
$1 - S(\rho)$ bits.

The work deficit is the difference between the globally and the
locally extractable work within the framework of LOCC, i.e. local
operations and classical communication, when $\rho$ is the state of a
system with spatially separated subsystems. Suppose that we have an
$n$-qubit state $\rho^{AB}$, which is shared by Alice and Bob; then
the optimal work extractable is

$$W_{\rm global} = n - S(\rho^{AB}), \qquad (29)$$

if one can access the entire system globally. On the other hand, we
let $W_{\rm local}$ be the largest amount of work that Alice and Bob
can locally extract from the same system under LOCC. The deficit
$\Delta$ is defined as

$$\Delta = W_{\rm global} - W_{\rm local}. \qquad (30)$$

In order to grasp this picture, let us compute the deficits for a
classically correlated state

$$\rho^{AB}_{\rm cl} = \frac{1}{2}\left(|00\rangle\langle 00| + |11\rangle\langle 11|\right) \qquad (31)$$

and a maximally entangled state

$$|\Phi^{AB}\rangle = \frac{1}{\sqrt{2}}\left(|00\rangle + |11\rangle\right). \qquad (32)$$

The globally extractable work $W_{\rm global}$ from
$\rho^{AB}_{\rm cl}$ is simply 1 bit. The locally extractable work
$W_{\rm local}$ turns out to be also 1 bit. The protocol is as
follows. Alice can measure her bit in the $\{|0\rangle, |1\rangle\}$
basis and send the result to Bob, who can obtain 1 bit of work from
his bit. Although Alice can extract 1 bit of work from her own bit,
using her measurement result, she needs to consume this energy to
erase the information stored in the memory, which was used to
communicate with Bob. Thus, the deficit for the state
$\rho^{AB}_{\rm cl}$ is $\Delta_{\rm cl} = 1 - 1 = 0$. The locally
extractable work is the same, i.e. 1 bit, even if the state is
maximally entangled as in Eq. (32). However, as this state is pure
globally, we can have
$W_{\rm global} = 2 - S(|\Phi^{AB}\rangle\langle\Phi^{AB}|) = 2$
bits, therefore $\Delta_{\rm ent} = 2 - 1 = 1$.

These two simple examples suggest that the 'strength' of correlation
could be reflected in the deficit, though the deficit might not
necessarily be the amount of entanglement. In fact, the authors of
Ref. [74] proposed later in a more detailed paper [75] that the
(quantum) deficit can be interpreted as the amount of quantumness of
correlations, not entanglement.

It has been shown in Ref. [74] that the deficit is bounded from below
as $\Delta \ge \max\{S(\rho^A), S(\rho^B)\} - S(\rho)$ (under an
assumption about the classicality of the communication channel),
where $\rho^A$ and $\rho^B$ are the reduced density operators, i.e.
$\rho^A = {\rm Tr}_B\,\rho$ and $\rho^B = {\rm Tr}_A\,\rho$. The
bound (or the upper bound for $W_{\rm local}$) can be achieved when
the state is pure and it turns out to be equal to the entanglement
measure for pure states. This is simply because a pure state can be
written as $|\psi\rangle = \sum_i a_i |e_i\rangle|f_i\rangle$ in the
Schmidt decomposition and then $\Delta = S(\rho^A) = E(\psi)$, where
$\rho^A = {\rm Tr}_B |\psi\rangle\langle\psi|$ and $E(\cdot)$ is the
entanglement measure for pure states.
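
The two examples of Eqs. (31) and (32) and the lower bound on the deficit can be reproduced with a few lines of code. This is a restricted sketch under stated assumptions: it treats only two-qubit states that are either diagonal in the $|ab\rangle$ basis (given by their joint probabilities) or pure with real amplitudes, and the helper names are hypothetical. The values of $W_{\rm local}$ are not computed here; they are taken from the protocol described in the text.

```python
import math

def h_bits(probs):
    """Shannon entropy in bits; (numerically) zero probabilities are skipped."""
    return -sum(p * math.log2(p) for p in probs if p > 1e-15)

def qubit_entropy(rho):
    """Von Neumann entropy in bits for a real symmetric 2x2 density matrix."""
    (a, c), (_, d) = rho
    gap = math.sqrt((a - d) ** 2 + 4 * c * c)
    return h_bits([(a + d + gap) / 2, (a + d - gap) / 2])

def classical_state_report(joint):
    """joint[a][b]: probabilities of a state diagonal in the |ab> basis."""
    s_ab = h_bits([p for row in joint for p in row])
    s_a = h_bits([sum(row) for row in joint])
    s_b = h_bits([sum(col) for col in zip(*joint)])
    return {"W_global": 2 - s_ab, "deficit_lower_bound": max(s_a, s_b) - s_ab}

def pure_state_report(amps):
    """amps[a][b]: real amplitudes of the pure state sum_ab amps[a][b] |ab>."""
    rho_a = [[sum(amps[i][k] * amps[j][k] for k in range(2)) for j in range(2)]
             for i in range(2)]
    s_a = qubit_entropy(rho_a)      # equals S(rho_B); S(rho_AB) = 0 for a pure state
    return {"W_global": 2.0, "deficit_lower_bound": s_a}

classical = classical_state_report([[0.5, 0.0], [0.0, 0.5]])      # Eq. (31)
amp = 1 / math.sqrt(2)
entangled = pure_state_report([[amp, 0.0], [0.0, amp]])           # Eq. (32)

# With W_local = 1 bit in both cases (from the text), Eq. (30) gives the deficits.
deficit_cl = classical["W_global"] - 1
deficit_ent = entangled["W_global"] - 1
```

In both examples the lower bound coincides with the deficit obtained from Eq. (30), namely 0 for the classically correlated state and 1 for the maximally entangled state.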

A similar approach has been taken in an attempt to quantify the
amount of entanglement in Ref. [76]. There, the (asymptotically)
minimal amount of noise added to the system to erase the correlation
was examined: roughly speaking, it can be characterized by the number
of allowed operations from which we choose randomly to make the given
state separable. In the discussion above on deficit, the correlation
is converted into work and the purity of the system is destroyed.
Instead, noise is added actively here, and the information about the
chosen operations is erased in the end, dissipating entropy into the
environment.



B. Quantum discord

A similar approach to measuring the quantumness of correlations has
been taken by Zurek [77] by using the concept of "quantum discord"
[78], which was introduced by him, and a Maxwell's demon. Let us
recall the definition of the mutual information between two systems,
$A$ and $B$, in classical information theory:

$$I(A:B) = H(A) + H(B) - H(A,B). \qquad (33)$$

To clarify the quantumness later, we substitute the definition of the
joint entropy $H(A,B) = H(A) + H(B|A) = H(B) + H(A|B)$ into Eq. (33)
to define the locally mutual information as

$$J_B(A:B_{\{|B_k\rangle\}}) = H(A) + H(B) - [H(B) + H(A|B)]_{\{|B_k\rangle\}} = H(A) + H(B) - H_B(A, B_{\{|B_k\rangle\}}), \qquad (34)$$

where the subscript $B$ and $\{|B_k\rangle\}$ are used to stress that
we are accessing the system $B$ locally by using the basis
$\{|B_k\rangle\}$. Now the discord is defined as

$$\delta(A|B_{\{|B_k\rangle\}}) = I(A:B) - J_B(A:B_{\{|B_k\rangle\}}) = H_B(A, B_{\{|B_k\rangle\}}) - H(A,B). \qquad (35)$$

Here the (basis-independent) joint entropy $H(A,B)$ is given by the
von Neumann entropy of the whole state $\rho^{AB}$, i.e.
$H(A,B) = S(\rho^{AB}) = -{\rm Tr}\,\rho^{AB} \log \rho^{AB}$.

The discord defined here is the work deficit $\Delta$ in Eq. (30)
when the measurement is done in the basis $\{|B_k\rangle\}$ only on
the subsystem $B$ after one-way communication (from $A$ to $B$).
Zurek described this scenario as the comparison of work-extraction
efficiency by classical and quantum Maxwell's demons: a classical
demon is local, while a quantum demon can perform measurements on the
whole system in a global basis in the combined Hilbert space. The
difference in efficiency of work extraction is equal to the discord
$\delta(A|B_{\{|B_k\rangle\}})$, if the classical demon employs
$\{|B_k\rangle\}$ as his measurement basis. Thus, the least discord
over all (local) measurement bases, i.e.
$\delta(A|B) = \min_{\{|B_k\rangle\}} \delta(A|B_{\{|B_k\rangle\}})$,
coincides with the work deficit $\Delta$ when only one-way
communication is allowed. Obviously, more communication only helps to
increase the classically extractable work, so $\delta \ge \Delta$ is
a general upper bound on $\Delta$.

C. Thermodynamic separability criterion

The degree of correlation, particularly the difference between
classical and quantum ones, can also be characterized by discussing
the locally extractable work from a heat bath via a given state,
without comparing it with globally extractable work. This is possible
despite the fact that the optimal locally extractable work from a
pair is the same for both types of correlation, as we have seen in
Section VIII A; the difference can manifest itself if we do not
optimize the work in a single setting for extracting. Thus the
inequality obtained here works as an entanglement witness [66] with
locally observable thermodynamic quantities.

Suppose that two parties (or demons), Alice and Bob, choose their
measurement bases as $A_\theta = \{P_\theta, P_\theta^{\perp}\}$ and
$B_{\theta'} = \{P_{\theta'}, P_{\theta'}^{\perp}\}$, respectively,
where $\theta$ ($\theta'$) represents the direction of the basis.
Alice performs her measurement with $A_\theta$ on her qubits and
sends all results to Bob. Then Bob can extract
$1 - H(B_{\theta'}|A_\theta)$ bits of work per pair on his side after
compressing the information of his measurement outcomes, where
$H(X|Y)$ is the Shannon entropy of $X$, conditional on the knowledge
of $Y$. Only when the shared system is in a maximally entangled
state, such as $|\Phi^+\rangle = (|00\rangle + |11\rangle)/\sqrt{2}$,
can $H(A_\theta|B_\theta)$ vanish for all $\theta$. That is, we can
extract more work from entangled pairs than from classically
correlated pairs.

Let us choose more measurement bases in order to minimize the
dependence of the work on the particular choice of bases. Alice and
Bob first divide their shared ensemble into groups of two pairs to
make the process symmetric with respect to each of them. For each
group, they both choose a projection operator randomly and
independently out of a set, $\{A_1, \dots, A_n\}$ for Alice and
$\{B_1, \dots, B_n\}$ for Bob, just before their measurement. Then,
Alice measures one of the two qubits in a group with the projector
she chose and informs Bob of the outcome as well as her basis choice.
Bob performs the same on his qubit of the other pair in the group. As
a result of collective manipulations on the set of those groups for
which they chose $A_i$ and $B_j$, they can extract a maximum of
$2 - H(A_i|B_j) - H(B_j|A_i)$ bits of work per two pairs (see
Fig. 13).

Next, we add up all the work that can be obtained by continuously
varying the basis over a great circle on the Bloch sphere, i.e. the
circle of maximum possible size on a sphere. This is similar in
approach to the chained Bell's inequalities discussed in Ref. [79].
The circle should be chosen to maximize the sum. Thus, the quantity
we consider is


$$\Xi(\rho) := \frac{1}{2\pi} \int_0^{2\pi} \xi_\rho(A(\theta), B(\theta))\, d\theta, \qquad (36)$$

where
$\xi_\rho(A(\theta), B(\theta')) = 2 - H(A(\theta)|B(\theta')) - H(B(\theta')|A(\theta))$
is the extractable work from two copies of $\rho$ in the asymptotic
limit when Alice and Bob choose $A(\theta)$ and $B(\theta')$, and
$\theta$ is the angle representing a point on the great circle. Then,
we can show that $\Xi(\rho)$ can be used as a separability criterion:
an inequality

$$\Xi(\rho) \le \Xi(|00\rangle) \qquad (37)$$

is a necessary condition for a two-dimensional bipartite state $\rho$
to be separable, that is,
$\rho = \sum_i p_i\, \rho_i^A \otimes \rho_i^B$. The state
$|00\rangle$ on the right-hand side of Eq. (37) can be any pure
product state. We obtained the value of $\Xi(|00\rangle)$ numerically
as 0.8854 bits. We refer to this inequality (37) as a "thermodynamic
separability criterion". The proof of this proposition is based on
the concavity of the entropy [80].

FIG. 13: Schematic view of the protocol to extract work from
correlated pairs. Two pairs in the figure represent an ensemble for
which Alice and Bob use $A_\theta$ and $B_{\theta'}$ for their
measurement and work extraction. For a half of this ensemble, Alice
measures her state with $A_\theta$ and Bob extracts work from his
side along the direction of $\theta'$, according to Alice's
measurement results. For the other half, they exchange their roles.

The integral in Eq. (36) can be performed over the whole Bloch sphere, instead of the great circle, to get another separability criterion. Let $\Xi_{BS}$ denote the new integral; then Eq. (37) becomes $\Xi_{BS}(\rho) \leq \Xi_{BS}(|00\rangle)$, where $\Xi_{BS}(|00\rangle)$ can be found numerically as 0.5573. The proposition above about the separability holds for $\Xi_{BS}(\rho)$ as well. Let us now compute the value of $\Xi_{BS}(\rho_W)$, where

\[
\rho_W = p\,|\Psi^-\rangle\langle\Psi^-| + \frac{1-p}{4}\, I \tag{38}
\]

is the Werner state [6a], to see the extent to which the inequality can be satisfied when we vary $p$. It is known that the Bell-Clauser-Horne-Shimony-Holt (Bell-CHSH) inequalities [81] are violated by $\rho_W$ when $p > 1/\sqrt{2} = 0.7071$. On the other hand, $\rho_W$ is inseparable if and only if $p > 1/3$, according to the Peres-Horodecki criterion [13, H3|. A bit of algebraic calculation leads to $\Xi_{BS}(\rho_W) = (1-p)\log_2(1-p) + (1+p)\log_2(1+p)$, and this is greater than $\Xi_{BS}(|00\rangle)$ when $p > 0.6006$. Therefore, the inequality for $\Xi_{BS}$ is stronger than the Bell-CHSH inequalities when detecting inseparability of the Werner states. This difference, we suspect, is due to the nonlinearity of the witness function, which is $\Xi$ in this case. Related analyses are presented independently in Refs. [U, [ID, H(| from the point of view of the entropic uncertainty relations.
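
The three numbers quoted above (0.8854, 0.5573 and 0.6006) can be checked with a few lines of numerical integration. The following is a minimal sketch rather than the authors' own procedure. It assumes that for the product state both parties measure along the same axis at polar angle theta from the z axis, so that the outcomes are independent and each conditional entropy reduces to the binary entropy of cos^2(theta/2). The function names, the midpoint rule and the bisection search are all choices made here for illustration.

```python
import math

def binary_entropy(x):
    """Shannon entropy (in bits) of a two-outcome distribution {x, 1 - x}."""
    if x <= 0.0 or x >= 1.0:
        return 0.0  # convention: 0 * log 0 = 0
    return -x * math.log2(x) - (1.0 - x) * math.log2(1.0 - x)

def xi_product_state(theta):
    """xi for |00> when both parties measure along polar angle theta.

    Outcomes are independent, so H(A|B) = H(A) and H(B|A) = H(B), each
    equal to the binary entropy of cos^2(theta / 2).
    """
    return 2.0 - 2.0 * binary_entropy(math.cos(theta / 2.0) ** 2)

def xi_great_circle(steps=20000):
    """Midpoint-rule average of xi over theta in [0, 2*pi)."""
    d = 2.0 * math.pi / steps
    return sum(xi_product_state((k + 0.5) * d) for k in range(steps)) / steps

def xi_bloch_sphere(steps=20000):
    """Average of xi over the sphere, using that cos(theta) is uniform on [-1, 1]."""
    d = 2.0 / steps
    return sum(
        xi_product_state(math.acos(-1.0 + (k + 0.5) * d)) for k in range(steps)
    ) / steps

def xi_bs_werner(p):
    """Closed form quoted in the text for the Werner state (0 <= p < 1)."""
    return (1.0 - p) * math.log2(1.0 - p) + (1.0 + p) * math.log2(1.0 + p)

def werner_threshold(bound, lo=0.0, hi=0.999, iterations=60):
    """Bisection for the p at which xi_bs_werner(p) crosses `bound`.

    xi_bs_werner is increasing on [0, 1), so a simple bisection suffices.
    """
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if xi_bs_werner(mid) < bound:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

if __name__ == "__main__":
    print("great circle :", round(xi_great_circle(), 4))
    print("Bloch sphere :", round(xi_bloch_sphere(), 4))
    print("Werner p*    :", round(werner_threshold(xi_bloch_sphere()), 4))
```

Under these assumptions the two averages have closed forms, $2/\ln 2 - 2$ for the great circle and $2 - 1/\ln 2$ for the Bloch sphere, which agree with the quoted values to four decimal places.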



IX. PHYSICAL IMPLEMENTATIONS OF THE 
DEMON 

Apart from our own (limited) interests, there are of 
course a myriad of other interesting works on Maxwell's 
demon in the quantum regime, particularly on the physi- 
cal implementations of the work-extracting engine under 
control of the demon. Let us briefly review some of them 
in this section. 

Lloyd [87] proposed an experimental realization of Maxwell's demon using nuclear magnetic resonance (NMR) techniques. A spin-1/2 particle prepared in a standard state, e.g. $|\downarrow\rangle$, works as the demon (memory) to store the information in a given state, which is also a spin-1/2 particle. Extracting work is done by applying a $\pi$ pulse (to flip it in the $\{|\uparrow\rangle, |\downarrow\rangle\}$ basis) at the spin's precession frequency $\omega = 2\mu B/\hbar$, where $\mu$ is the magnetic moment of the spin and $B$ is the external magnetic field: a photon of energy $\hbar\omega$ will be emitted to the field when the spin flips from the higher energy state $|\uparrow\rangle$ to the lower energy state $|\downarrow\rangle$. If each of the two spins uses a heat bath at a different temperature, then we can consider a cycle that performs work in analogy with the Carnot cycle. The quantumness comes into the discussion of the inefficiency of the cycle, compared with the ideal Carnot case, which is due to the entropy increase by (quantum) projective measurements. That is, more entropy increase is needed to erase the original information stored in the demon's memory.
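
To give a feeling for the energy scale involved, the precession frequency $\omega = 2\mu B/\hbar$ and the photon energy $\hbar\omega$ can be evaluated for a concrete spin. The sketch below assumes a proton in a 1 T field and a bath at 300 K. Neither value comes from Lloyd's proposal, and both are chosen purely for illustration. The comparison with $kT\ln 2$ is only meant to indicate orders of magnitude.

```python
import math

HBAR = 1.054571817e-34         # reduced Planck constant, J s
K_B = 1.380649e-23             # Boltzmann constant, J/K
MU_PROTON = 1.41060679736e-26  # proton magnetic moment, J/T (illustrative spin)

def precession_frequency(mu, b_field):
    """Angular precession frequency omega = 2 * mu * B / hbar, in rad/s."""
    return 2.0 * mu * b_field / HBAR

def photon_energy(mu, b_field):
    """Energy hbar * omega released when the spin flips to its lower level, in J."""
    return HBAR * precession_frequency(mu, b_field)

def landauer_energy(temperature):
    """k T ln 2: the energy scale associated with one bit at a given temperature, in J."""
    return K_B * temperature * math.log(2)

if __name__ == "__main__":
    b_field = 1.0        # tesla (assumed)
    temperature = 300.0  # kelvin (assumed)
    omega = precession_frequency(MU_PROTON, b_field)
    print("frequency [MHz]   :", omega / (2.0 * math.pi) / 1e6)
    print("photon energy [J] :", photon_energy(MU_PROTON, b_field))
    print("kT ln 2 [J]       :", landauer_energy(temperature))
```

For these assumed values the photon energy is several orders of magnitude below $kT\ln 2$ at room temperature.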

A practical realization of Lloyd's analysis was proposed in Ref. [88], in which superconducting qubits [89], instead of spin-1/2 particles, are used to manipulate information/energy transfer. The necessity of two heat baths at different temperatures is represented by a temperature gradient between two qubits, and the sequence of actual operations is presented there.

The idea of directly using energy stored in two-level atoms was presented by Scully and his coworkers [90, 91, 92, 93]. In their scenario, two-level atoms are
first randomized in a hohlraum, which is a hollow cavity 
that thermalizes the energy levels of incoming atoms, and 
next they are separated by a Stern-Gerlach-type appara- 
tus into two spatial paths. Useful energy could be ex- 
tracted from atoms in the excited state, and then atoms 
in the two paths are combined. At the final stage the 
which-path information is erased isothermally, dissipat- 
ing heat (entropy) into the environment, to recycle atoms 
for another cycle. 

In summary, all these physical cycles involve a process 
that merges two different physical paths representing two 
logical states. The merger of two paths corresponds to 
the logically irreversible process that we discussed in Section III, as was also emphasized in Ref. [94], and thus
leads to an entropy increase in the outside world of the 
information carrier. 

Scully [95] further proposed another type of heat engine, which has important quantum aspects. Instead of the two-level atoms in the above example, three-level atoms are now used to provide useful work (energy) to the radiation field in a cavity. The atom has one excited level $|a\rangle$ and two nearly degenerate ground levels, $|b\rangle$ and $|c\rangle$. The atoms are initially prepared to have some small population in the level $|a\rangle$ and a coherent superposition between $|b\rangle$ and $|c\rangle$, that is, the density operator is given by $\rho_0 = p_a |a\rangle\langle a| + (1 - p_a)|g\rangle\langle g|$, where $|g\rangle = c_b|b\rangle + c_c|c\rangle$ ($|c_b|^2 + |c_c|^2 = 1$). The amplitudes $c_b$ and $c_c$, as well as the cavity frequency, are tuned so that the probability of transition from $|g\rangle$ to $|a\rangle$ vanishes [119].
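
The tuning condition can be made concrete with a toy model. The sketch below assumes that the amplitude for the transition from $|g\rangle$ to $|a\rangle$ is simply the sum of the two path amplitudes, $c_b g_b + c_c g_c$, where $g_b$ and $g_c$ are hypothetical coupling strengths for the two transitions. These couplings are not specified in the text and are introduced here only to illustrate how destructive interference makes the transition probability vanish.

```python
import math

def trapped_amplitudes(g_b, g_c):
    """Return normalized (c_b, c_c) satisfying c_b * g_b + c_c * g_c = 0.

    g_b and g_c are hypothetical (possibly complex) couplings of the
    b -> a and c -> a transitions; they are not taken from the text.
    """
    norm = math.sqrt(abs(g_b) ** 2 + abs(g_c) ** 2)
    if norm == 0.0:
        raise ValueError("at least one coupling must be nonzero")
    return g_c / norm, -g_b / norm

def excitation_probability(c_b, c_c, g_b, g_c):
    """|c_b * g_b + c_c * g_c|^2, proportional to the g -> a transition
    probability in this toy model."""
    return abs(c_b * g_b + c_c * g_c) ** 2

if __name__ == "__main__":
    g_b, g_c = 1.0, 1.0  # equal couplings, assumed for illustration
    c_b, c_c = trapped_amplitudes(g_b, g_c)
    print("trapped  :", excitation_probability(c_b, c_c, g_b, g_c))
    # An in-phase superposition of the same two levels is not trapped.
    s = 1.0 / math.sqrt(2.0)
    print("in phase :", excitation_probability(s, s, g_b, g_c))
```

With equal couplings the trapped superposition is the antisymmetric one, while the symmetric superposition absorbs at twice the single-path rate in this model.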

An interesting consequence of Scully's idea is that quantum coherence, in $|g\rangle$, could be useful to enhance the efficiency of the thermodynamic cycle even beyond the Carnot efficiency. This is because it could be possible to extract work, in the form of photons, from a single heat bath. Such a scenario of extracting work from a single heat bath is indeed reminiscent of Szilard's demon-assisted one-molecule engine in Section II B. In Scully's engine, quantum coherence plays the role of the demon.
Because of the suppressed absorption of photons by the 
atoms, cold atoms absorb less than they would in the 
absence of coherence, while hot atoms do emit photons. 
Hence there is a sorting action, which could be seen as a 
demon's maneuver. As in Szilard's case, entropy needs to 
be dissipated when resetting the state of the demon. In- 
cluding the entropy cost for initializing the atoms in the 
total entropy bill ensures the validity of the second law. 
A more detailed physical implementation was studied in 
Ref. [HI. 

Another noteworthy example, also proposed by Scully, 
might be the quantum heat engine that makes use of 
the difference in energy gaps of a three-level atom [97, 98]. By combining maser and laser cavities to control the
population of each level, it could be possible to devise a 
Carnot-type or Otto-type heat engine and calculate the 
upper bounds on their efficiency. 

Kieu proposed a related, but different, type of engine that consists of a two-level potential well [99].
Work-extracting cycles can be done by manipulating the 
parameters for the potential, such as its width and depth. 
Then the relationships between the temperatures of the 
heat baths, the change in energy levels, and the ex- 
tractable work are analyzed, confirming the validity of 
the second law in the quantum regime. This type of 
idea was considerably extended to more general cases and scrutinized in terms of quantum Carnot and Otto engines in Refs. [100, 101]. The work in Ref. [100] also provides a succinct and pedagogical presentation of quantum heat engines.

When it comes to the demon in the quantum world, there is also an interesting analysis of Landauer's erasure principle in the quantum regime. Reference [102]
discussed the validity of the principle when entanglement 
is taken into account due to the interaction between the 
memory system and the environment. However, they 
identified the Clausius inequality and the erasure prin- 
ciple directly, and showed that the Clausius inequality 
could be violated because of entanglement. This seems to 
be incompatible with the results in Refs. @, Hl|, where 
the Clausius inequality was not used to derive the bound. 



X. CONCLUDING REMARKS 

Since his birth in the late 19th century, Maxwell's de- 
mon has surely been enjoying watching scientists strug- 
gling with his paradox. Nevertheless, he has led us to a 
new paradigm over the past century, i.e. the interplay of 
physics and information theory. 

To conclude this article, we wish to re-stress that re- 
alizing the irreplaceable reciprocity between physics and 
information has given rise to a number of implications 
in the foundations of not only quantum mechanics, but 
also gravity. This may suggest that information
could help us merge quantum mechanics and gravity,
since Maxwell's demon is playing his game at the very 
core of both theories. Moreover, the interplay has been 
a powerful driving force in the development of quantum 
information science. 

We had probably better prepare for more 'demonic' intellectual challenges, as more revolutionary paradigm shifts might be expected in any field of the natural sciences. It is therefore still too early to presume the demise of the demon, with plenty of mysteries in nature lying before us.
[106] The load should be varied continuously to match the pressure so that the expansion is a quasistatic and reversible process, which enables the pressure to be expressed as p = kT/V.

[107] By including the temperature T, we hereafter use the 
same unit 'bit' for both entropy and thermodynamical 
work. 

[108] The idea of negative entropy itself was introduced by Schrödinger to discuss living systems that keep throwing entropy away to the environment. It was renamed negentropy by Brillouin, who associated it with information.

[109] Equal probabilities for $P_{\text{pre-meas}}$ (or $P_{\text{post-meas}}$) possible states are assumed.

[110] Although, at first sight, taking $\Delta S := S_f - S_i$ seems more natural to express the second law, the subscripts only represent the state either before or after a measurement that provides us with information on the system (bound information), not a physical time evolution. Therefore the change in entropy due to physical evolution should be written as $\Delta S_f$.

[111] Otherwise, the molecule can always be detected near
the edge of the illuminated region. This region can then 
be made as thin as the size of the molecule to maximize 
the work-extracting efficiency. 

[112] Von Neumann defined the entropy S of a quantum state 
p by a simple thermodynamic consideration [H^| . Sup- 
pose a vessel is filled with an ideal gas, every molecule 
of which is in the state p. One now decomposes the gas 
into the set of gas components, each of which is in a pure state $|\psi_i\rangle$, by means of semipermeable membranes. The entropy $S(\rho)$ is defined (up to a constant factor) as the
minimal thermodynamic entropy increase in the envi- 
ronment that is necessary to transform the initial state 
to the final state, where every molecule is in the same 
pure state and is distributed uniformly over the whole 
vessel. The zero entropy for any pure state is postulated. 

[113] This means that the membrane Mi_ performs a mea- 
surement about the property A just before the molecule 
hits it and the molecule's post-measurement property 
will become Aj with probability q{Aj,B^). 

[114] A message can be any set of alphabets. It refers to a 
word/letter sent from sender to receiver, information 
which is stored in memory, etc. We assume that the in- 
formation source generates independent and identically 
distributed variables/alphabets according to the proba- 
bility distribution {pi}. 

[115] One may be tempted to use the averaged message state 
p as the pre-erasure state. However, this is not the right 
way of viewing it. Before the erasing procedure, the en- 
coder, who prepared the state, or the memory itself still 
knows which of $\{|\psi_i\rangle\}$ it is in. Information erasure is a process that destroys correlations between the memory and the encoder or the system accessing it, by transforming the state to a standard state (i.e. $|e_0\rangle\langle e_0|$ in Lubkin's erasure), irrespective of the initial state.
In other words, there must be a perfect correlation or 
knowledge before the erasure, which will be lost after- 
wards. Averaging over an ensemble means that even the 
encoder already lost information about his/her prepara- 
tion. Hence, in this case, the entropy of the pre-erasure 
state should be taken as 0. Considering the classical 
counterpart (Fig. 3(a)) may be useful to understand this
reasoning.

[116] The conversion factor $k \ln 2$ is set to be equal to unity again as a unit of entropy.

[117] The justification of this equivalence described in Fig. 1 of Ref. [56] has an error in identifying entropy changes in the process. Nevertheless, this equivalence between $W_{\text{ext}}$ and $I(A:B)$ can be verified by a straightforward calculation using the state equation for an ideal gas.

[118] Looking at entanglement as a resource of quantum infor- 
mation processing naturally suggests the way to quan- 
tify entanglement in terms of its usefulness for such 
tasks. This leads to the idea of distillable entanglement, i.e. the average number of maximally entangled pairs that can be distilled from a given pair using only local operations and classical communication (LOCC)

[liM liS fl . 



[119] Such a coherent trapping in $|g\rangle$ occurs due to the destructive quantum interference between two transitions, namely, $|b\rangle \to |a\rangle$ and $|c\rangle \to |a\rangle$. See Chapter 7 of [105].



